`bgc_data_processing.core.loaders.base`¶

Base Loaders.

`BaseLoader(provider_name, category, exclude, variables)` ¶

Bases: ABC

Base class to load data.

Parameters:

Name	Type	Description	Default
`provider_name`	`str`	Data provider name.	required
`category`	`str`	Category provider belongs to.	required
`exclude`	`list[str]`	Filenames to exclude from loading.	required
`variables`	`SourceVariableSet`	Storer object containing all variables to consider for this data, both those present in the data file and those not represented in it.	required

Source code in src/bgc_data_processing/core/loaders/base.py

def __init__(
    self,
    provider_name: str,
    category: str,
    exclude: list[str],
    variables: "LoadingVariablesSet",
) -> None:
    self._provider = provider_name
    self._category = category
    self._exclude = exclude
    self._variables = variables

`provider: str` `property` ¶

_provider attribute getter.

Returns:

Type	Description
`str`	Data provider name.

`category: str` `property` ¶

Returns the category of the provider.

Returns:

Type	Description
`str`	Category provider belongs to.

`variables: LoadingVariablesSet` `property` ¶

_variables attribute getter.

Returns:

Type	Description
`LoadingVariablesSet`	Loading variables storer.

`excluded_filenames: list[str]` `property` ¶

Filenames to exclude from loading.

`is_file_valid(filepath)` ¶

Indicate whether a file is valid to be kept or not.

Parameters:

Name	Type	Description	Default
`filepath`	`Path \| str`	Path or name of the file to check.	required

Returns:

Type	Description
`bool`	True if neither the file's full path nor its name is in the excluded filenames.

Source code in src/bgc_data_processing/core/loaders/base.py

def is_file_valid(self, filepath: Path | str) -> bool:
    """Indicate whether a file is valid to be kept or not.

    Parameters
    ----------
    filepath : Path | str
        Path or name of the file to check.

    Returns
    -------
    bool
        True if neither the file's full path nor its name is in the
        excluded filenames.
    """
    keep_path = str(filepath) not in self.excluded_filenames
    keep_name = Path(filepath).name not in self.excluded_filenames

    return keep_name and keep_path
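
As a minimal sketch of the check above, the following standalone function reproduces the same two comparisons outside the class. The function name `is_file_valid_sketch`, the explicit `excluded` argument, and the sample file names are hypothetical and only serve the illustration.

```python
from __future__ import annotations

from pathlib import Path


def is_file_valid_sketch(filepath: Path | str, excluded: list[str]) -> bool:
    """Standalone stand-in mirroring BaseLoader.is_file_valid (illustrative only)."""
    # The full path, as a string, must not be listed in the exclusions.
    keep_path = str(filepath) not in excluded
    # The bare file name must not be listed either.
    keep_name = Path(filepath).name not in excluded
    return keep_name and keep_path


# Hypothetical exclusion list: one entry given as a name, one as a full path.
excluded = ["bad_cruise.csv", "data/old/legacy.csv"]

print(is_file_valid_sketch("data/2020/bad_cruise.csv", excluded))   # False: name is excluded
print(is_file_valid_sketch("data/old/legacy.csv", excluded))        # False: full path is excluded
print(is_file_valid_sketch("data/2020/good_cruise.csv", excluded))  # True: neither matches
```

Both comparisons are exact string matches, so an exclusion entry must equal either the bare file name or the path string exactly as it is passed in; patterns and partial paths do not match.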

`load(filepath)` `abstractmethod` ¶

Load data.

Returns:

Type	Description
`DataFrame`	Loaded data.

Source code in src/bgc_data_processing/core/loaders/base.py

@abstractmethod
def load(self, filepath: str) -> pd.DataFrame:
    """Load data.

    Returns
    -------
    pd.DataFrame
        Loaded data.
    """
    ...
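
Because `load` is abstract, the base class cannot be instantiated directly and every concrete loader has to provide its own implementation. The sketch below illustrates that contract with stand-in classes: `LoaderSketch`, `CSVLoaderSketch`, and the in-memory `files` mapping are all hypothetical and not part of the library. To stay dependency-free, the stand-in returns a list of dictionaries, whereas the signature above is annotated with `pd.DataFrame`.

```python
from __future__ import annotations

import csv
import io
from abc import ABC, abstractmethod


class LoaderSketch(ABC):
    """Hypothetical stand-in for BaseLoader with the same abstract `load`."""

    def __init__(self, provider_name: str, category: str) -> None:
        self._provider = provider_name
        self._category = category

    @abstractmethod
    def load(self, filepath: str):
        """Load data."""
        ...


class CSVLoaderSketch(LoaderSketch):
    """Hypothetical concrete loader parsing CSV text into a list of dicts."""

    def __init__(
        self,
        provider_name: str,
        category: str,
        files: dict[str, str],
    ) -> None:
        super().__init__(provider_name, category)
        # In-memory "filesystem" so the example runs without real files.
        self._files = files

    def load(self, filepath: str) -> list[dict[str, str]]:
        return list(csv.DictReader(io.StringIO(self._files[filepath])))


try:
    LoaderSketch("provider", "in_situ")  # abstract class: instantiation fails
except TypeError as error:
    print(f"TypeError: {error}")

loader = CSVLoaderSketch("provider", "in_situ", {"a.csv": "LAT,LON\n60.0,5.0\n"})
rows = loader.load("a.csv")
print(rows)  # [{'LAT': '60.0', 'LON': '5.0'}]
```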

`remove_nan_rows(df)` ¶

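Remove rows with NaN values: a row is dropped if any variable listed in `to_remove_if_any_nan` is NaN, or if all variables listed in `to_remove_if_all_nan` are NaN.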

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	DataFrame from which to remove rows.	required

Returns:

Type	Description
`DataFrame`	DataFrame with the NaN rows removed.

Source code in src/bgc_data_processing/core/loaders/base.py

def remove_nan_rows(self, df: pd.DataFrame) -> pd.DataFrame:
    """Remove rows with NaN values in the variables flagged for removal.

    Parameters
    ----------
    df : pd.DataFrame
        DataFrame from which to remove rows.

    Returns
    -------
    pd.DataFrame
        DataFrame with the NaN rows removed.
    """
    # Load keys
    vars_to_remove_when_any_nan = self._variables.to_remove_if_any_nan
    vars_to_remove_when_all_nan = self._variables.to_remove_if_all_nan
    # Check for nans
    if vars_to_remove_when_any_nan:
        any_nans = df[vars_to_remove_when_any_nan].isna().any(axis=1)
    else:
        any_nans = pd.Series(False, index=df.index)
    if vars_to_remove_when_all_nan:
        all_nans = df[vars_to_remove_when_all_nan].isna().all(axis=1)
    else:
        all_nans = pd.Series(False, index=df.index)
    # Get indexes to drop
    indexes_to_drop = df[any_nans | all_nans].index
    return df.drop(index=indexes_to_drop)
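
To make the two removal rules concrete, here is a small sketch that applies the same pandas operations to a toy DataFrame. The column names and the two lists `remove_if_any_nan` and `remove_if_all_nan` are hypothetical stand-ins for what `to_remove_if_any_nan` and `to_remove_if_all_nan` would provide in a real variables set.

```python
import numpy as np
import pandas as pd

# Hypothetical column lists standing in for
# `variables.to_remove_if_any_nan` and `variables.to_remove_if_all_nan`.
remove_if_any_nan = ["LATITUDE", "LONGITUDE"]
remove_if_all_nan = ["NITRATE", "PHOSPHATE"]

df = pd.DataFrame(
    {
        "LATITUDE": [60.0, np.nan, 61.0, 62.0],
        "LONGITUDE": [5.0, 6.0, 7.0, 8.0],
        "NITRATE": [1.2, 0.8, np.nan, np.nan],
        "PHOSPHATE": [0.1, 0.2, np.nan, 0.3],
    }
)

# Rule 1: drop a row if ANY of these columns is NaN.
any_nans = df[remove_if_any_nan].isna().any(axis=1)
# Rule 2: drop a row only if ALL of these columns are NaN.
all_nans = df[remove_if_all_nan].isna().all(axis=1)

cleaned = df.drop(index=df[any_nans | all_nans].index)
print(cleaned.index.tolist())  # [0, 3]
```

Row 1 is dropped because its latitude is missing, and row 2 because both nutrient columns are missing. Row 3 is kept: only one of the two nutrient columns is NaN, so the "all NaN" rule does not apply.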
