Import methods use `detect_types` to detect if autoconversion from `str` to `int`, `float` and `bool` is needed
Showing 4 changed files with 88 additions and 51 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,8 @@ | ||
# tSQLike | ||
|
||
[![Python package](https://github.com/mezantrop/tSQLike/actions/workflows/python-package.yml/badge.svg)](https://github.com/mezantrop/tSQLike/actions/workflows/python-package.yml) | ||
[![CodeQL](https://github.com/mezantrop/tSQLike/actions/workflows/codeql.yml/badge.svg)](https://github.com/mezantrop/tSQLike/actions/workflows/codeql.yml) | ||
|
||
## SQL-like interface to tabular structured data | ||
|
||
**Not that early stage, but still in development: may contain bugs** | ||
|
@@ -9,7 +11,7 @@ | |
|
||
## Description | ||
|
||
**tSQLike** is a Python3 module that is written with a hope to make tabular data process easier using SQL-like primitives. | ||
**tSQLike** is a Python3 module that is written with a hope to make tabular data process easier using SQL-like primitives. | ||
|
||
## Usage | ||
|
||
|
@@ -34,16 +36,18 @@ t3.write_csv(dialect='unix') | |
|
||
## Installation | ||
|
||
``` | ||
```sh | ||
pip install tsqlike | ||
``` | ||
|
||
## Functionality | ||
|
||
### Table class | ||
The main class of the module | ||
|
||
The main class of the module | ||
|
||
#### Data processing methods | ||
|
||
| Name | Status | Description | | ||
|-------------|---------|--------------------------------------------------------------------------| | ||
| `join` | ☑ | Join two Tables (`self` and `table`) on an expression [*](#Warning) | | ||
|
@@ -54,20 +58,23 @@ The main class of the module | |
| `group_by` | ☑ | GROUP BY primitive of SQL SELECT to apply aggregate function on a column | | ||
|
||
#### Import methods | ||
|
||
| Name | Status | Description | | ||
|---------------------|---------|-------------------------------------------------------------------------| | ||
| `import_dict_lists` | ☑ | Import a dictionary of lists into Table object | | ||
| `import_dict_lists` | ☑ | Import a dictionary of lists into Table object | | ||
| `import_list_dicts` | ☑ | Import a list of horizontal arranged dictionaries into the `Table` | | ||
| `import_list_lists` | ☑ | Import `list(list_1(), list_n())` with optional first row as the header | | ||
|
||
#### Export methods | ||
|
||
| Name | Status | Description | | ||
|---------------------|---------|-------------------------------------------------------------------------| | ||
| `export_dict_lists` | ☑ | Export a dictionary of lists | | ||
| `export_list_dicts` | ☑ | Export list of dictionaries | | ||
| `export_list_lists` | ☑ | Export `list(list_1(), list_n())` with optional first row as the header | | ||
|
||
#### Write methods | ||
|
||
| Name | Status | Description | | ||
|-----------------|---------|---------------------------------------------------------------------| | ||
| `write_csv` | ☑ | Make `CSV` from the `Table` object and write it to a file or stdout | | ||
|
@@ -76,12 +83,13 @@ The main class of the module | |
| `write_xml` | ☐ | Write `XML`. NB: Do we need this? | | ||
|
||
#### Private methods | ||
|
||
| Name | Status | Description | | ||
|----------------|---------|-------------------------------------------| | ||
| `_redimension` | ☑ | Recalculate dimensions of the Table.table | | ||
|
||
|
||
### EvalCtrl class | ||
|
||
Controls what arguments are available to `eval()` function | ||
|
||
| Name | Status | Description | | ||
|
@@ -91,23 +99,27 @@ Controls what arguments are available to `eval()` function | |
| `blacklist_remove` | ☑ | Remove the word from the blacklist | | ||
|
||
### Standalone functions | ||
| Name | Status | Description | | ||
|--------------|---------|-------------------------------------------| | ||
| `open_file` | ☑ | Open a file | | ||
| `close_file` | ☑ | Close a file | | ||
| `read_json` | ☑ | Read `JSON` file | | ||
| `read_csv` | ☑ | Read `CSV` file | | ||
| `read_xml` | ☐ | Read `XML`. NB: Do we need XML support? | | ||
|
||
#### WARNING! | ||
Methods `Table.join(on=)`, `Table.select(where=)` and `Table.write_json(export_f=)`, use `eval()` function | ||
to run specified expressions within the program. **ANY** expression, including one that is potentially **DANGEROUS** | ||
|
||
| Name | Status | Description | | ||
|--------------|---------|----------------------------------------------------------| | ||
| `open_file` | ☑ | Open a file | | ||
| `close_file` | ☑ | Close a file | | ||
| `read_json` | ☑ | Read `JSON` file | | ||
| `read_csv` | ☑ | Read `CSV` file | | ||
| `read_xml` | ☐ | Read `XML`. NB: Do we need XML support? | | ||
| `to_type` | ☑ | Convert a string to a proper type: int, float or boolean | | ||
|
||
#### WARNING | ||
|
||
Methods `Table.join(on=)`, `Table.select(where=)` and `Table.write_json(export_f=)` use the `eval()` function | ||
to run the specified expressions within the program. **ANY** expression, including one that is potentially **DANGEROUS** | ||
from a security point of view, can be passed as the values of the above arguments. It is your duty to ensure correctness | ||
and safety of these arguments and `EvalCtrl` helps to block potentially dangerous function/method names. | ||
and safety of these arguments and `EvalCtrl` helps to block potentially dangerous function/method names. | ||
|
||
Alternatively you can use `Table.join_lt()`, `Table.select_lt()` and `Table.write_json()`. They are significantly less | ||
powerful, but do not use `eval()`. | ||
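
To illustrate why an `eval()`-free filter is safer, here is a minimal sketch of a comparison driven by a fixed operator table. It is not tSQLike's implementation: `COMPARATORS` and `filter_rows` are hypothetical names used only for this example, and the data is made up.

```python
import operator

# Fixed whitelist of comparison operators; nothing outside this table can run
COMPARATORS = {
    '==': operator.eq,
    '!=': operator.ne,
    '>': operator.gt,
    '>=': operator.ge,
    '<': operator.lt,
    '<=': operator.le,
    'in': lambda a, b: a in b,
    'not in': lambda a, b: a not in b,
}


def filter_rows(header, rows, where, comp, val):
    """Hypothetical helper: keep rows whose `where` column satisfies `comp val`."""
    idx = header.index(where)
    test = COMPARATORS[comp]  # raises KeyError for an unsupported operator
    return [r for r in rows if test(r[idx], val)]


header = ['name', 'age']
rows = [['ann', 31], ['bob', 25], ['eve', 40]]
over_30 = filter_rows(header, rows, 'age', '>', 30)
```

Because the operator is looked up in a dictionary rather than evaluated as source text, a caller can only trigger one of the listed comparisons.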
|
||
## Contacts | ||
If you have an idea, a question, or have found a problem, do not hesitate to open an issue or mail me directly: | ||
|
||
If you have an idea, a question, or have found a problem, do not hesitate to open an issue or mail me directly: | ||
Mikhail Zakharov <[email protected]> |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,3 @@ | ||
""" Version number in a single place """ | ||
|
||
__version__ = "1.0.4" | ||
__version__ = "1.1.0" |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -92,16 +92,29 @@ def close_file(file): | |
if file and file is not sys.stdout and file is not sys.stdin: | ||
file.close() | ||
|
||
# ------------------------------------------------------------------------------------------------ # | ||
def to_type(s): | ||
""" Convert string s to a proper type: int, float or boolean """ | ||
|
||
if s in ('True', 'true', 'False', 'false'): # to boolean | ||
return s in ('True', 'true')  # bool('False') would be True: any non-empty string is truthy | ||
|
||
try: | ||
return float(s) if '.' in s or ',' in s else int(s) # to float and int | ||
except (ValueError, TypeError): | ||
return s # no conversion possible -> string | ||
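
The conversion rules can be tried in isolation. The following is a minimal standalone sketch rather than the module's code: `to_type_sketch` is a hypothetical name, and it maps the boolean literals explicitly because in Python `bool()` of any non-empty string, including `'False'`, is `True`.

```python
def to_type_sketch(s):
    """Hypothetical helper: convert a string to bool, float or int if possible."""
    if s in ('True', 'true'):
        return True
    if s in ('False', 'false'):
        return False
    try:
        # Strings containing a dot are tried as float, everything else as int
        return float(s) if '.' in s else int(s)
    except (ValueError, TypeError):
        # Not numeric, or not a string at all: return the value unchanged
        return s


converted = [to_type_sketch(v) for v in ['42', '3.14', 'false', 'abc', 7]]
```

Values that are already non-strings, such as the integer `7` above, pass through unchanged because the `'.' in s` test raises `TypeError`, which the sketch catches.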
|
||
# ------------------------------------------------------------------------------------------------ # | ||
def read_csv(in_file=None, encoding=None, newline='', name='', dialect='excel', **fmtparams): | ||
def read_csv(in_file=None, encoding=None, newline='', name='', detect_types=False, | ||
dialect='excel', **fmtparams): | ||
""" | ||
Read CSV from a file and import into a Table object | ||
:param in_file: Filename to read CSV from | ||
:param encoding: Character encoding | ||
:param newline: UNIX/Windows/Mac style line ending | ||
:param name: Table name to assign | ||
:param detect_types: Detect and correct types of data, default - False | ||
:param dialect: CSV dialect, e.g: excel, unix | ||
:**fmtparams: Various optional CSV parameters: | ||
:param delimiter: CSV field delimiter | ||
|
@@ -112,18 +125,19 @@ def read_csv(in_file=None, encoding=None, newline='', name='', dialect='excel', | |
|
||
f = open_file(in_file, file_mode='r', encoding=encoding, newline=newline) | ||
_data = csv.reader(f, dialect=dialect, **fmtparams) | ||
t = Table(data=list(_data), name=name) | ||
t = Table(data=list(_data), name=name, detect_types=detect_types) | ||
close_file(f) | ||
return t | ||
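
As background for the `detect_types` flag: the standard `csv.reader` returns every field as a string, so numeric or boolean columns stay `str` unless they are converted afterwards. A small sketch with in-memory sample data (the sample content is made up for illustration):

```python
import csv
import io

# In-memory CSV standing in for a file on disk
sample = io.StringIO("id,price,active\n1,9.5,true\n2,12.0,false\n")
rows = list(csv.reader(sample))

# Collect the type names of all data cells (header row skipped)
cell_types = {type(v).__name__ for r in rows[1:] for v in r}
```

Here `cell_types` contains only `'str'`, which is the situation the autoconversion is meant to address.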
|
||
|
||
# -------------------------------------------------------------------------------------------- # | ||
def read_json(in_file=None, name=''): | ||
def read_json(in_file=None, name='', detect_types=False): | ||
""" Read JSON data from file | ||
:param in_file: Filename to read JSON from | ||
:param name: Table name to assign | ||
:return Table | ||
:param in_file: Filename to read JSON from | ||
:param name: Table name to assign | ||
:param detect_types: Detect and correct types of data, default - False | ||
:return Table | ||
""" | ||
|
||
_data = {} | ||
|
@@ -132,7 +146,7 @@ def read_json(in_file=None, name=''): | |
_data = json.load(f) | ||
except (IOError, OSError) as _err: | ||
print(f'[email protected]_json(): Unable to load JSON structure: {_err}') | ||
t = Table(data=_data, name=name) | ||
t = Table(data=_data, name=name, detect_types=detect_types) | ||
close_file(f) | ||
return t | ||
|
||
|
@@ -201,7 +215,7 @@ class Table: | |
""" | ||
|
||
# -------------------------------------------------------------------------------------------- # | ||
def __init__(self, data=None, name=None): | ||
def __init__(self, data=None, name=None, detect_types=False): | ||
self.timestamp = int(time.time()) | ||
self.name = name or str(self.timestamp) | ||
|
||
|
@@ -212,13 +226,13 @@ def __init__(self, data=None, name=None): | |
self.cols = 0 | ||
elif isinstance(data, list) and len(data): | ||
if isinstance(data[0], dict): # list(dicts()) | ||
self.import_list_dicts(data) | ||
self.import_list_dicts(data, detect_types=detect_types) | ||
if isinstance(data[0], list): # list(lists()) | ||
self.import_list_lists(data) | ||
self.import_list_lists(data, detect_types=detect_types) | ||
elif isinstance(data, dict) and len(data): | ||
print(type(next(iter(data)))) | ||
if isinstance(data[next(iter(data))], list): # dict(lists()): | ||
self.import_dict_lists(data) | ||
self.import_dict_lists(data, detect_types=detect_types) | ||
else: | ||
raise ValueError('FATAL@Table.__init__: Unexpected data format') | ||
|
||
|
@@ -244,14 +258,15 @@ def _redimension(self): | |
self.cols = self.rows and len(self.table[0]) or 0 | ||
|
||
# -- Import methods -------------------------------------------------------------------------- # | ||
def import_list_dicts(self, data, name=None): | ||
def import_list_dicts(self, data, name=None, detect_types=False): | ||
""" | ||
Import a list of dictionaries | ||
:alias: import_thashes() | ||
:param data: Data to import formatted as list of dictionaries | ||
:param name: If not None, set it as the Table name | ||
:return: self | ||
:alias: import_thashes() | ||
:param data: Data to import formatted as list of dictionaries | ||
:param name: If not None, set it as the Table name | ||
:param detect_types: Detect and correct types of data, default - False | ||
:return: self | ||
""" | ||
|
||
# Set a new Table name if requested | ||
|
@@ -262,7 +277,8 @@ def import_list_dicts(self, data, name=None): | |
self.header = [self.name + TNAME_COLUMN_DELIMITER + str(f) | ||
if TNAME_COLUMN_DELIMITER not in str(f) else f for f in (data[0].keys())] | ||
|
||
self.table = [list(r.values()) for r in data] | ||
self.table = [list(r.values()) for r in data] if not detect_types else [[to_type(v) for v in r.values()] for r in data] | ||
|
||
else: | ||
raise ValueError('FATAL@Table.import_list_dicts: Unexpected data format') | ||
|
||
|
@@ -272,7 +288,7 @@ def import_list_dicts(self, data, name=None): | |
return self | ||
|
||
# -------------------------------------------------------------------------------------------- # | ||
def import_dict_lists(self, data, name=None): | ||
def import_dict_lists(self, data, name=None, detect_types=False): | ||
""" | ||
Import a dictionary of lists | ||
""" | ||
|
@@ -290,7 +306,7 @@ def import_dict_lists(self, data, name=None): | |
|
||
for c, f in enumerate(data.keys()): | ||
for r, v in enumerate(data[f]): | ||
self.table[r][c] = v | ||
self.table[r][c] = v if not detect_types else to_type(v) | ||
self._redimension() | ||
else: | ||
raise ValueError('FATAL@Table.import_dict_lists: Unexpected data format') | ||
|
@@ -299,14 +315,15 @@ def import_dict_lists(self, data, name=None): | |
return self | ||
|
||
# -------------------------------------------------------------------------------------------- # | ||
def import_list_lists(self, data, header=True, name=None): | ||
def import_list_lists(self, data, header=True, name=None, detect_types=False): | ||
""" | ||
Import list(list_1(), list_n()) with optional first row as the header | ||
:param data: Data to import formatted as list of lists | ||
:param header: If true, data to import HAS a header | ||
:param name: If not None, set it as the Table name | ||
:return: self | ||
:param data: Data to import formatted as list of lists | ||
:param header: If true, data to import HAS a header | ||
:param name: If not None, set it as the Table name | ||
:param detect_types: Detect and correct types of data, default - False | ||
:return: self | ||
""" | ||
|
||
# Set a new Table name if requested | ||
|
@@ -315,7 +332,11 @@ def import_list_lists(self, data, header=True, name=None): | |
|
||
if isinstance(data, list) and len(data) and isinstance(data[0], list): | ||
# TODO: Check all rows to be equal length | ||
self.table = data[1:] if header else data | ||
if not detect_types: | ||
self.table = data[1:] if header else data | ||
else: | ||
self.table = [[to_type(v) for v in r] for r in (data[1:] if header else data)] | ||
|
||
self._redimension() | ||
|
||
# If table header is not properly initiated, make each column: "name.column" | ||
|
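
When importing a list of lists, the first row is a header only if `header` is true, so conversion should skip it only in that case. The sketch below shows this with hypothetical helpers (`convert_rows` and `int_or_str` are illustration names, not part of tSQLike):

```python
def int_or_str(v):
    """Stand-in converter (an assumption): int if possible, else unchanged."""
    try:
        return int(v)
    except (ValueError, TypeError):
        return v


def convert_rows(data, header=True, convert=None):
    """Hypothetical helper: split off an optional header row and convert cells."""
    head = data[0] if header else []
    body = data[1:] if header else data
    if convert is None:
        return head, body
    return head, [[convert(v) for v in r] for r in body]


data = [['id', 'city'], ['1', 'Oslo'], ['2', 'Rome']]

# With a header: the first row is kept aside, the rest is converted
head, body = convert_rows(data, header=True, convert=int_or_str)

# Without a header: every row is data, none may be dropped
no_head, all_rows = convert_rows(data[1:], header=False, convert=int_or_str)
```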
@@ -634,17 +655,18 @@ def select_lt(self, columns='*', where='', comp='==', val='', new_tname=''): | |
data=r_table + [[r[c] for c in r_columns] for r in self.table]) | ||
|
||
scol_idx = self.header.index(where) | ||
_type = type(val) | ||
return Table(name=new_tname if new_tname else | ||
self.name + TNAME_TNAME_DELIMITER + str(self.timestamp), | ||
data=r_table + [[r[c] for c in r_columns] | ||
for r in self.table | ||
if comp == '==' and r[scol_idx] == val or | ||
comp == '!=' and r[scol_idx] != val or | ||
comp == '>' and r[scol_idx] > val or | ||
comp == '>=' and r[scol_idx] >= val or | ||
comp == '<=' and r[scol_idx] <= val or | ||
comp == 'in' and r[scol_idx] in val or | ||
comp == 'not in' and r[scol_idx] not in val]) | ||
if comp == '==' and _type(r[scol_idx]) == val or | ||
comp == '!=' and _type(r[scol_idx]) != val or | ||
comp == '>' and _type(r[scol_idx]) > val or | ||
comp == '>=' and _type(r[scol_idx]) >= val or | ||
comp == '<=' and _type(r[scol_idx]) <= val or | ||
comp == 'in' and _type(r[scol_idx]) in val or | ||
comp == 'not in' and _type(r[scol_idx]) not in val]) | ||
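
The effect of casting each cell to the type of `val` before comparing can be seen in a small standalone sketch. `coerce_and_compare` is a hypothetical helper written for this example, not a tSQLike function:

```python
def coerce_and_compare(cell, val):
    """Hypothetical helper: cast cell to the type of val, then test cell > val."""
    return type(val)(cell) > val


rows = [['a', '10'], ['b', '9'], ['c', '100']]

# Plain string comparison is lexicographic, so '10' > '9' is False
as_strings = [r[0] for r in rows if r[1] > '9']

# Casting each cell to type(9), i.e. int, compares numerically
as_numbers = [r[0] for r in rows if coerce_and_compare(r[1], 9)]

# Caveat: for a list-valued val, type(val) is list, and list('10') splits the string
split_cell = list('10')
```

Two caveats follow from plain Python semantics: the cast raises `ValueError` for cells that cannot be converted (for example `int('abc')`), and with `in` or `not in` a list-valued `val` turns the cell into a list of characters, so membership tests may not behave as intended.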
|
||
# -------------------------------------------------------------------------------------------- # | ||
def order_by(self, column='', direction=ORDER_BY_INC, new_tname=''): | ||
|