Volumes not persisted between restarts #5

Closed · jonl-percsolutions-com opened this issue Apr 24, 2018 · 23 comments

I am using this plugin to deploy some applications via docker-compose. However, containers marked restart: always fail to start after a restart because their volumes error out as non-existent.

I installed with the following commands:

sudo docker plugin install --alias centos-nfs trajano/centos-mounted-volume-plugin --grant-all-permissions --disable
sudo docker plugin set centos-nfs PACKAGES=nfs-utils
sudo docker plugin set centos-nfs MOUNT_TYPE=nfs
sudo docker plugin set centos-nfs MOUNT_OPTIONS=hard,proto=tcp,nfsvers=4,intr
sudo docker plugin enable centos-nfs

My volume declaration looks like this:

  webapp-logs:
    driver: centos-nfs
    driver_opts:
      device: host:logs/webapp
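
For reference, here is roughly the full compose file that fragment sits in, under the top-level volumes key; the service name, image, and compose version are placeholders rather than my actual ones, and the mount destination matches the inspect output below:

version: "3.4"
services:
  webapp:
    image: example/webapp        # placeholder image
    restart: always
    volumes:
      - webapp-logs:/var/log/webapps
volumes:
  webapp-logs:
    driver: centos-nfs
    driver_opts:
      device: host:logs/webapp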

My docker compose command is like so:

sudo /usr/local/bin/docker-compose -f compose.yml up -d --build --force-recreate --remove-orphans

The resultant mount from docker inspect looks like so:

        "Mounts": [
            {
                "Type": "volume",
                "Name": "jistdeploy_webapp-logs",
                "Source": "/var/lib/docker/plugins/180d32f4982687ecfb6df714d95941749c0fc85c140d3e0180c9396775fa87cc/propagated-mount/13f96f9f26655ba3baddfc8e64d4bf812e54c1c5c68e24cd05081bbad0cba227",
                "Destination": "/var/log/webapps",
                "Driver": "centos-nfs:latest",
                "Mode": "rw",
                "RW": true,
                "Propagation": ""
            }
        ]

On restart of docker or the host I get the following error:

dockerd[1762]: time="2018-04-24T17:51:43.106196883Z" level=error msg="Failed to start container 3de3cad008484136b1c690c26fd46d17a7db80c01fc1187f4e9c2a9fac80b09d: get jistdeploy_webapp-logs: VolumeDriver.Get: volume jistdeploy_webapp-logs does not exist"

When docker is stopped, the "Source" location is removed.

Is there a way to make this persistent or have the plugin reconnect to the share on startup?

trajano self-assigned this Apr 24, 2018

trajano (Owner) commented Apr 24, 2018

It should be persistent already; otherwise it's a bug. Let me try to reproduce it.

trajano (Owner) commented Apr 24, 2018

I am guessing you don't have the rpcbind service started up. Or do you not need that for NFSv4?
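
If it isn't running, something along these lines should start it and keep it enabled on CentOS 7 (a sketch, assuming systemd):

systemctl status rpcbind
sudo systemctl enable rpcbind
sudo systemctl start rpcbind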

trajano (Owner) commented Apr 24, 2018

A temporary workaround I can think of is to define the volume externally, like the following sample:

docker volume create -d trajano/centos-mounted-volume-plugin \
    --opt device=192.168.1.1:/mnt/routerdrive/nfs nfsmountvolume
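
You can then point compose at it as an external volume, something like this (assuming the volume name from the command above):

volumes:
  webapp-logs:
    external:
      name: nfsmountvolume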

trajano (Owner) commented Apr 24, 2018

Hmm... I'm looking further into it. There may be something off when using compose; it works when doing a stack deploy.

trajano (Owner) commented Apr 24, 2018

One question, though: are you using compose because you're running on Windows/Mac to connect to a shared mount? I had issues with shared drives on Windows before, due to a race condition between the volume and the container: docker/for-win#584

trajano (Owner) commented Apr 24, 2018

#3 is implemented now; I'm pushing the image up right this moment, and https://hub.docker.com/r/trajano/nfs-volume-plugin/ should have it shortly. You may want to try that, since it's a simpler interface and should load faster because nfs-utils is preloaded.
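
A rough sequence for trying it out might look like the following; the device and mount options here are illustrative, not a confirmed configuration:

docker plugin install trajano/nfs-volume-plugin --grant-all-permissions
docker volume create -d trajano/nfs-volume-plugin \
    --opt device=192.168.1.1:/mnt/routerdrive/nfs \
    --opt nfsopts=hard,proto=tcp,nfsvers=4,intr nfsmountvolume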

jonl-percsolutions-com (Author) commented

You were correct: rpcbind was not running. It took a while to figure out what was going on; there was some error in the startup process, perhaps because nothing "wanted" rpcbind on startup. So I created a workaround by having nfs-common.target want it. I also ran into some other issues with my scripts, but they are now all fixed, and after testing with rpcbind starting, the error is still occurring, i.e., on shutdown of docker the volumes are removed.
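
Concretely, the workaround was along these lines (a sketch of the idea; the exact target name may differ by distribution):

sudo systemctl add-wants nfs-common.target rpcbind.service
sudo systemctl daemon-reload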

I am testing this in an AWS VM running CentOS 7.4.

jonl-percsolutions-com (Author) commented

Will try the new build too.

jonl-percsolutions-com (Author) commented

The issue still occurs in the latest version. It also occurs if the volume is manually created with the docker volume create command.

If docker is shut down, I would think it should be removing the volumes; however, the plugin would then need to persist information about which volumes to restore on startup?

trajano (Owner) commented Apr 24, 2018

Yeah, I would think so too, but I don't see anything in the API spec that says that, or where it would be preserved. However, in my current setup for cifs and gluster (there's nothing special with regard to storage) I don't have anything that would retain it, yet it restores the volumes when the swarm comes back up. I wonder if that's a limitation of managed plugins: they only restore themselves if deployed as a stack/service.

trajano (Owner) commented Apr 24, 2018

I wonder if you'd have better luck with https://github.com/ContainX/docker-volume-netshare, though that would require you to install the NFS binaries on the host and run it as a service.

trajano (Owner) commented Apr 24, 2018

I'm not the only one with the issue, apparently: sapk/docker-volume-gluster#6

trajano (Owner) commented Apr 24, 2018

I'll make one minor change and push it up as a global-cap tag: basically switching to "global" rather than "local" in the capabilities, though I doubt that will do anything.
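
For reference, this amounts to changing what the plugin returns from the VolumeDriver.Capabilities call in the volume plugin API, roughly:

{
    "Capabilities": {
        "Scope": "global"
    }
}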

trajano (Owner) commented Apr 24, 2018

Hmm... so far so good...

[root@docker-engine glusterfs-volume-plugin]# docker plugin enable  trajano/nfs-volume-plugin:global-cap
trajano/nfs-volume-plugin:global-cap
[root@docker-engine glusterfs-volume-plugin]# docker volume create -d trajano/nfs-volume-plugin:global-cap --opt device=192.168.1.1:/mnt/routerdrive/nfs --opt nfsopts=hard,proto=tcp,nfsvers=3,intr,nolock nfsmountvolumeg
nfsmountvolumeg
[root@docker-engine glusterfs-volume-plugin]# docker volume ls
DRIVER                                 VOLUME NAME
trajano/nfs-volume-plugin:global-cap   nfsmountvolumeg
cifs:latest                            noriko/s/hath
cifs:latest                            noriko/s/letsencrypt
cifs:latest                            noriko/s/portainer
cifs:latest                            noriko/s/registry
cifs:latest                            noriko/s/site
gluster:latest                         trajano/nexus
[root@docker-engine glusterfs-volume-plugin]# systemctl restart docker
[root@docker-engine glusterfs-volume-plugin]# docker volume ls
DRIVER                                 VOLUME NAME
trajano/nfs-volume-plugin:global-cap   nfsmountvolumeg
cifs:latest                            noriko/s/hath
cifs:latest                            noriko/s/letsencrypt
cifs:latest                            noriko/s/portainer
cifs:latest                            noriko/s/registry
cifs:latest                            noriko/s/site
gluster:latest                         trajano/nexus

jonl-percsolutions-com (Author) commented

By the way, I tried netshare, but I was having similar startup issues due to it being a full-fledged service that, I believe, did not reliably start before docker did. I ran across your plugin in one of their issues. I thought the plugin approach was better and easier to maintain, and that if there were similar issues the timing would be easier to resolve since the plugin lifecycle is managed by docker.

trajano (Owner) commented Apr 24, 2018

I'm starting to think what you're looking for is not allowed by Docker. The timing issue was one of the reasons why I set up the managed plugin approach too, though primarily for the CIFS side.

jonl-percsolutions-com (Author) commented

The global-cap seems like it might be the route to take.

trajano (Owner) commented Apr 24, 2018

It does not appear to work when I do a full restart. I think I need a way of persisting data into the plugin and somehow restoring it.

trajano (Owner) commented Apr 25, 2018

What I am thinking is storing the volume info map in a BoltDB database and keeping that as part of the plugin.

jonl-percsolutions-com (Author) commented

Sounds like a plan. I'm not 100% sure how plugins work, so just out of curiosity: what about storing the data in the Mounts portion of the plugin's config.json? Would that be possible? It would seem that might be what it is designed for.
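
Something like this in the plugin's config.json, i.e. binding a host directory into the plugin so its state survives restarts (the paths here are made up for illustration):

"mounts": [
    {
        "source": "/var/lib/docker-plugin-state",
        "destination": "/data",
        "type": "bind",
        "options": ["rbind"]
    }
]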

trajano (Owner) commented Apr 25, 2018

@jonl-percsolutions-com that seems to have worked. Only nfs-volume-plugin:latest has this right now. My test is to create the volume, restart all the nodes, and see if I can still mount it. Let me know if it works for you, and I will apply the changes to the others.

trajano (Owner) commented May 21, 2018

Closing, as the changes were applied a while back.

trajano closed this as completed May 21, 2018
jonl-percsolutions-com (Author) commented

Finally about to do another deployment after a month of dev. I updated this yesterday and it appears to be working fantastically.
