Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support Multiple Database Servers, supply db and creds via params #2716

Open
MarkCupitt opened this issue Mar 20, 2023 · 24 comments
Open

Support Multiple Database Servers, supply db and creds via params #2716

MarkCupitt opened this issue Mar 20, 2023 · 24 comments
Assignees
Labels
feedback: captured Feedback in issue has been captured needs: requirements The problem is clear and worth solving, but we're not yet sure of the best solution type: enhancement New feature or request user reported Reported by a Mathesar user work: backend Related to Python, Django, and simple SQL
Milestone

Comments

@MarkCupitt
Copy link

Problem

We run a Kubernetes Admin cluster that maintains a number of child clusters, each with their own postgres databases.

As we understand it, mathesar is deployed on a one -> one relationship per database.

Proposed solution

IT would be very useful to be able to select which server one wanted to connect to, opr even better, pass in the server required of a single running matheus instance, eg: via Header, querystring or whatever.

There user/password should be bale to be (optionally), we understand the security issues) be able to be included, along with a specific database to access. Obviously we could match read/only creds to a database or read/write depending on the users role

@MarkCupitt MarkCupitt changed the title Support Multiple Database Servers Support Multiple Database Servers, supply db and creds via params Mar 20, 2023
@rajatvijay rajatvijay added work: backend Related to Python, Django, and simple SQL status: draft user reported Reported by a Mathesar user and removed status: triage labels Mar 20, 2023
@rajatvijay
Copy link
Contributor

@mathemancer @kgodey can you take a look?

@kgodey
Copy link
Contributor

kgodey commented Mar 20, 2023

@MarkCupitt We already have some support for this (you can configure any number of databases to connect to Mathesar). Our existing permissions system should be able to handle users with different levels of permissions to different databases.

The main things missing are documenting how to set up multiple databases and UI to switch between databases. We're tracking this in #2006.

Do you think that covers your use case or is there anything missing?

@kgodey kgodey added this to the v0.1.2 milestone Mar 20, 2023
@MarkCupitt
Copy link
Author

MarkCupitt commented Mar 21, 2023

Hi @kgodey I think it will go close so let me describe a perfect world scenario from a laymans standpoint

We are using Teleport to protect our operational and production systems, an it would probably be a very good idea if you could support it. Teleport is a secure proxy, that implements RBAC to determine who gets to see whats behind it.

When it calls an application, it provides a [ RFD ] JWT in a header called teleport-jwt-assertion which allows for a good level of user cred integration and role propagation

We would love to leverage this capability to extend roles and permissions into mathesar. As you are very aware, database security is very difficult to implement effectively without handing out passwords, so the ability to map a role supplied by teleport to a database user, schema and password to access the db would be incredibly useful as long as it was secure and encrypted in storage

We would see us connecting to a single mathesar instance, and a list of databases that the users RBAC gave them permissions to would display, along with a RW RO OWNER indicator, clicking on the database would switch to what is there now, which is super by the way

It could also use the logged in teleport user as the database user, but in our case, we want to go with generic users controlled by rbac to manage access

A use case like above would be incredibly useful, an may generate a lot of interest among the community.

We are very concerned at using a UI like PGadmin 4 that requires us to hand out creds before it can be used, it makes cred management super hard, and we identify that issue as a strategic weakness we have to address and would dearly love a web based UI we could securely integrate in our Kubernetes Admin Clusters

@MarkCupitt
Copy link
Author

To add a further very useful capability, being able to extract the creds from something like Hashicorp vault (it has an api) would be VERY useful, as it would then allow for automatic password rotations by the vault and secure, enterprise password storage for your application as well .. Vault is very popular and we use it for a lot

@kgodey kgodey modified the milestones: v0.1.2, v0.1.3 Mar 21, 2023
@kgodey
Copy link
Contributor

kgodey commented Mar 21, 2023

Thanks @MarkCupitt, this is very helpful. This seems like a cross between UI for multiple databases and SSO support, which is on our roadmap. I'll need do some thinking and research about how best to integrate this into Mathesar.

@ghislaineguerin ghislaineguerin added the feedback: captured Feedback in issue has been captured label Mar 29, 2023
@kgodey kgodey modified the milestones: v0.1.3, Next release, Backlog Jul 26, 2023
@Anish9901 Anish9901 added needs: product approval It's not yet clear that this issue will actually improve Mathesar from a user's perspective needs: implementation specs We need clarity on HOW we'll implement it from a technical perspective needs: backend approval The backend team might not agree on whether this makes sense for the codebase and removed status: draft labels Jan 15, 2024
@seancolsen
Copy link
Contributor

@MarkCupitt I want to follow up with you about this feature request.

Mathesar 0.1.4 (released last week) now has a new feature that allows you to configure multiple connections to separate databases. You can see some screenshots of this feature within our release notes.

Would you be interesting in taking a look at this new feature and letting us know how closely it matches the needs you've described here?

@seancolsen seancolsen added the needs: user feedback We are waiting for a user to answer questions or provide feedback on our fix label Feb 5, 2024
@MarkCupitt
Copy link
Author

@seancolsen Many thanks, in fact, very timely, I will be upgrading that specific operational system over next two weeks, so I will revisit the install and give you some feedback on how it fits for us ..

@MarkCupitt
Copy link
Author

@seancolsen has there been any attempts that you are aware of to support mathesar in Kubernetes?

@MarkCupitt
Copy link
Author

Ah, just found this, we wil see if we can add some value to it

#2707

@MarkCupitt
Copy link
Author

@seancolsen I took look at this and it goes a lot of the way to what we need.

In our use case, we manage multiple databases, long term we are expecting to have 50+ databases we have to maintain and provide different levels of access to. Teleport allows us to issue short term creds for access for contractors and the like, on demand, the contractor could be a different person each time ..

We have two problems:

  1. restrict user access to different databases based on roles, eg Read Only, and so on. Some users require RW or Owner Privileges to some databases, and not on others

Integrating these user roles with an SSO like teleport, where roles can be managed centrally.

We don't really want to have to manage two systems, with different logins, that will get very unwieldy for us quite quickly, and leaves a risk that something falls through the cracks when a user leaves or changes status

My initial Issue on this is at #2716

We will deploy 0.1.4 shortly and give it a test drive, and pass back additional thoughts

Thanks for the work, it looks great

@MarkCupitt
Copy link
Author

@seancolsen I have created a Kubernetes Deployment version of Mathesar, I removed Caddy from the mix, as its was not needed in our scenario, and the caddy file pointed to a reverse proxy that I named differently, and I am happily able to access the UI

4 things of note so far:

1):

I noticed somewhere that it was planned to remove the env var MATHESAR_DATABASES we found this extremely useful in a k8s environmentg, as it lets us populate databases as part of our ci/cd when we deploy a new one. Please do nto remove it, or even better, allow us to mount a yaml file describing the databases like this

MATHESAR_DATABASES:
- "eng-o2/bounty|postgres://postgres:DO1c@{{postgres_master_pooler_url}}/bounty?sslmode=disable"
- "eng-o2/prometheus|postgres://postgres:DO1c@{{postgres_master_pooler_url}}/prometheus?sslmode=disable"

And use the ENV VAR to point to the location of the yaml file in the file system. This should also work great with docker as well. We will have 50+ databases and I am not keen on having to enter them all manually in every instance of Mathesar we deploy

2):

I am unable to access the user page in the admin section

image

The logs indicate a migration problem, here is what we got. (The image was the latest in your dockerhub repo)

2024-02-28 02:13:21 +0000] [191] [INFO] Starting gunicorn 20.1.0
[2024-02-28 02:13:21 +0000] [191] [INFO] Listening at: http://0.0.0.0:8000 (191)
[2024-02-28 02:13:21 +0000] [191] [INFO] Using worker: sync
[2024-02-28 02:13:21 +0000] [192] [INFO] Booting worker with pid: 192
/code/db/tables/operations/select.py:20: SAWarning: Did not recognize type 'regtype' of column 'column_type'
  return Table(
/code/db/tables/operations/select.py:20: SAWarning: Did not recognize type 'regrole' of column 'owner'
  return Table(
/code/db/tables/operations/select.py:20: SAWarning: Did not recognize type 'regtype' of column 'column_type'
  return Table(
/code/db/tables/operations/select.py:20: SAWarning: Did not recognize type 'regtype' of column 'column_type'
  return Table(
/code/db/tables/operations/select.py:20: SAWarning: Did not recognize type 'regrole' of column 'owner'
  return Table(

3):

Default Admin Password, has ot be set at first login, we would like to be able to pass this as ENV Vars, so we can manage the creds via our secrets in Kubernetes. eg: ADMIN_USERNAME and ADMIN_PASSWORD

4):

Users, not critical, but VERY helpful, again to allow us to manage user creds and roles declaratively by K8s secrets. Perhaps using a YAML file like this

users:
  - name: admin
    password: pass
    type: admin
  - name: joe
    password: pass
    type: standard

It would also be nice to be able to declaratively map the users to the roles on a per database basis as the UI does, but this is likely a big ask

If there was some sql we could execute to achieve above, that would also work


Other than the bug, the system is quite useable and the requests will make it very useful in a K8s environment.

If we can get above sorted, Ill be quite happy to make our deployment manifests available for you guys to publish, Ill need to tidy up a bit, but I think it will be a viable

@MarkCupitt
Copy link
Author

@seancolsen Result form attempt to perform the migration manually in the container, looks the models amy have an issue or similar

python manage.py makemigrations
Migrations for 'mathesar':
  mathesar/migrations/0007_auto_20240228_0338.py
    - Alter field column_order on tablesettings
root@mathesar-6bdc4fdb87-hlpx9:/code# python manage.py migrate
Operations to perform:
  Apply all migrations: admin, auth, contenttypes, mathesar, sessions
Running migrations:
  Applying mathesar.0007_auto_20240228_0338... OK
/code/db/tables/operations/select.py:20: SAWarning: Did not recognize type 'regtype' of column 'column_type'
  return Table(
/code/db/tables/operations/select.py:20: SAWarning: Did not recognize type 'regrole' of column 'owner'
  return Table(
/code/db/tables/operations/select.py:20: SAWarning: Did not recognize type 'regtype' of column 'column_type'
  return Table(
/code/db/tables/operations/select.py:20: SAWarning: Did not recognize type 'regtype' of column 'column_type'
  return Table(
/code/db/tables/operations/select.py:20: SAWarning: Did not recognize type 'regrole' of column 'owner'
  return Table(

@MarkCupitt
Copy link
Author

@seancolsen Im getting the XHR request response not valid JSON error on multiple pages

@seancolsen
Copy link
Contributor

Thanks for all this feedback, @MarkCupitt! We will have a maintainer look into this and see how we can help you out. I can't promise we'll be able to respond this week though, as we have multiple maintainers out in order to attend a PostgreSQL conference.

@MarkCupitt
Copy link
Author

MarkCupitt commented Feb 28, 2024 via email

@mathemancer mathemancer self-assigned this Feb 28, 2024
@mathemancer
Copy link
Contributor

mathemancer commented Feb 28, 2024

@MarkCupitt It would help me to diagnose this to know which versions of PostgreSQL DB servers you have hooked up to Mathesar right now. Are any of them < 13 or > 15?

Edit: Also, it would help if you have any more logs showing the problem. It seems like Mathesar is attempting to reflect some view or table (but given the types, I expect a view) that uses system catalog types which aren't supported by one of our dependencies (SQLAlchemy).

Finally, does the warning occur when a request is made, or just in the process of loading Mathesar? It seems like the latter from the logs you posted, but I'd like to confirm.

@MarkCupitt
Copy link
Author

MarkCupitt commented Feb 28, 2024 via email

@mathemancer
Copy link
Contributor

I had a couple more questions, but added them as an edit to my previous comment. I'm not sure how that works with the Github email comment setup, so I'll make them a separate comment to be sure.

It would help if you have any more logs showing the problem. It seems like Mathesar is attempting to reflect some view or table (but given the types, I expect a view) that uses system catalog types which aren't supported by one of our dependencies (SQLAlchemy).

Finally, does the warning occur when a request is made, or just in the process of loading Mathesar? It seems like the latter from the logs you posted, but I'd like to confirm.

@MarkCupitt
Copy link
Author

MarkCupitt commented Feb 29, 2024

@mathemancer I'm afraid that those logs are all there is, the logging output is pretty much zero. I tried to figure out how to enable debug and request logging but failed miserably... I can get a shell into the container, if you have some steps for me on how to get you better logs, I will be very happy to do it for you. I do have Django experience, but It was 10 years ago .. .

The warning is on UI requests, it's not every page, but it appears random. If I refresh the page enough times I can often get in, for example, when adding users, I set up a user, and on the next page load, the error occurs, but attempting to navigate back in, on several attempts I was able to get the page to load again... I can not see any pattern to it that makes sense to me

More than happy to perform whatever test you need to help out, just let me know, just so you know, I am GMT+8

@mathemancer
Copy link
Contributor

@MarkCupitt Do you mind if I email you? We could try to set up a call to debug early next week if that works for you. It might be the most efficient way forward.

@MarkCupitt
Copy link
Author

MarkCupitt commented Mar 1, 2024

No problem at all, please do .. we can do a screen share session if that helps, I see we are both GMT+8, I have sent an email to connect us via your "hello" generic email address

@kgodey
Copy link
Contributor

kgodey commented Mar 2, 2024

@MarkCupitt We got your email and I sent you a response with some availability. Thanks!

@seancolsen seancolsen modified the milestones: Backlog, High priority May 6, 2024
@seancolsen seancolsen added needs: requirements The problem is clear and worth solving, but we're not yet sure of the best solution and removed needs: user feedback We are waiting for a user to answer questions or provide feedback on our fix needs: product approval It's not yet clear that this issue will actually improve Mathesar from a user's perspective needs: implementation specs We need clarity on HOW we'll implement it from a technical perspective needs: backend approval The backend team might not agree on whether this makes sense for the codebase labels May 6, 2024
@seancolsen
Copy link
Contributor

Based on skimming this thread and chatting privately with @kgodey, here's my understanding of the next steps:

  1. We need to finish our backend refactor (which at this point probably means releasing Mathesar's beta version).

  2. Then we need to look into what it would take to implement this issue. This will entail some assessment of requirements, design, and implementation.

As such, I've re-labelled this issue so that we can come back to it later.

@kgodey
Copy link
Contributor

kgodey commented May 11, 2024

Thanks @seancolsen.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feedback: captured Feedback in issue has been captured needs: requirements The problem is clear and worth solving, but we're not yet sure of the best solution type: enhancement New feature or request user reported Reported by a Mathesar user work: backend Related to Python, Django, and simple SQL
Projects
No open projects
Development

No branches or pull requests

7 participants