
Use the modern opportunity "Prompt Engineering" to do related experiments in computer vision.


XinyueZ/cv-prompt-engineering


Awesome Computer Vision Prompting

Description

This repo explores how to apply Prompt Engineering to Computer Vision tasks.

Use state-of-the-art models such as Diffusion, or other baseline models, to generate, inpaint, and paint images.

streamlit run basic_app.py --server.port 5555 --server.enableCORS false

User interaction with Segment Anything Model

streamlit run sam_app.py --server.port 5555 --server.enableCORS false

Inpaint via user-interaction with Diffusion and SAM

streamlit run sam_inpaint_app.py --server.port 5555 --server.enableCORS false

Image: assets/van.jpg

Prompts:
  1. [Van] A Volkswagen California van, parked on a beach, with a surfboard on the roof.
  2. [Ground] Big grassland, a lot grass, green or grey grass.
  3. [Between sky and ground] Endless grassland.
  4. [Sky] Clear night sky with stars and full moon.
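The list above pairs each segmented region with its own prompt. A minimal sketch of how such pairs could be held in code, assuming a "[Region] prompt text" line format; `RegionPrompt` and `parse_region_prompts` are hypothetical names for illustration and are not taken from sam_inpaint_app.py:

```python
import re
from typing import NamedTuple


class RegionPrompt(NamedTuple):
    region: str  # label of the masked area, e.g. "Sky"
    text: str    # prompt text used to inpaint that area


# Assumed line format: optional "N." list number, then "[Region] prompt text".
_LINE = re.compile(r"^\s*(?:\d+\.\s*)?\[(?P<region>[^\]]+)\]\s*(?P<text>.+?)\s*$")


def parse_region_prompts(lines):
    """Turn '[Region] prompt' lines into RegionPrompt records."""
    prompts = []
    for line in lines:
        match = _LINE.match(line)
        if match is None:
            raise ValueError(f"not a '[Region] prompt' line: {line!r}")
        prompts.append(RegionPrompt(match["region"], match["text"]))
    return prompts


PROMPTS = parse_region_prompts([
    "1. [Van] A Volkswagen California van, parked on a beach, with a surfboard on the roof.",
    "2. [Ground] Big grassland, a lot grass, green or grey grass.",
    "3. [Between sky and ground] Endless grassland.",
    "4. [Sky] Clear night sky with stars and full moon.",
])

for p in PROMPTS:
    print(f"{p.region}: {p.text}")
```

Keeping the region label separate from the prompt text makes it easy to match each prompt to the mask selected for that region.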

Few-shot object tracking in video via SAM

Use SAM to obtain a mask of the object, then use that mask to track the object through all frames.
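The idea of carrying a first-frame mask through later frames can be sketched as below. This is not the method used in sam_tracker.py, which is not described here; it swaps in plain template matching (sum of absolute differences) as a stand-in, and it assumes grayscale frames of equal size stored as nested lists. All function names are hypothetical.

```python
def mask_bbox(mask):
    """Return (top, left, bottom, right) enclosing the True cells of a 2D mask."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    if not rows:
        raise ValueError("mask is empty")
    cols = [c for c in range(len(mask[0])) if any(row[c] for row in mask)]
    return rows[0], cols[0], rows[-1] + 1, cols[-1] + 1


def crop(frame, top, left, h, w):
    """Cut an h x w patch out of a frame."""
    return [row[left:left + w] for row in frame[top:top + h]]


def sad(a, b):
    """Sum of absolute differences between two equally sized patches."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def track(frames, first_mask, search=2):
    """Follow the masked object; returns one (top, left, h, w) box per frame."""
    top, left, bottom, right = mask_bbox(first_mask)
    h, w = bottom - top, right - left
    template = crop(frames[0], top, left, h, w)
    boxes = [(top, left, h, w)]
    for frame in frames[1:]:
        best = None
        # Try every offset within `search` pixels of the previous position.
        for dt in range(-search, search + 1):
            for dl in range(-search, search + 1):
                t, l = top + dt, left + dl
                if t < 0 or l < 0 or t + h > len(frame) or l + w > len(frame[0]):
                    continue  # candidate box would leave the frame
                score = sad(template, crop(frame, t, l, h, w))
                if best is None or score < best[0]:
                    best = (score, t, l)
        _, top, left = best
        template = crop(frame, top, left, h, w)  # refresh the template
        boxes.append((top, left, h, w))
    return boxes
```

In practice the mask would come from SAM after a click on the first frame, and a dedicated tracker would likely replace the matching loop.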

streamlit run sam_tracker.py --server.port 5556 --server.enableCORS false

Video src: https://dl.dropbox.com/s/0lalmh95tylyw4s/sculpture.mp4

Working comments

I personally think that prompting is a new programming approach. Don’t assume that guiding models with natural language is easy. On the contrary, I believe it’s quite the opposite. Natural language programming lacks the syntax of traditional programming languages, which means there are no type checks or any protective mechanisms in place. If the model (AI) receives an inappropriate prompt, the generated results can be completely different from what was expected.
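Since prompts come with no type checks, one option is to add a small guard before anything reaches the model. A minimal sketch, assuming prompts are kept as a region-to-text mapping; `check_prompts` and the `max_chars` limit are hypothetical and not part of this repo:

```python
def check_prompts(prompts, mask_regions, max_chars=300):
    """Return a list of problems; an empty list means the prompts passed.

    prompts:      dict mapping region label -> prompt text
    mask_regions: labels of the regions that actually have a mask
    max_chars:    assumed, arbitrary upper bound on prompt length
    """
    problems = []
    for region, text in prompts.items():
        if region not in mask_regions:
            problems.append(f"{region}: no mask with this label")
        if not text.strip():
            problems.append(f"{region}: prompt is empty")
        elif len(text) > max_chars:
            problems.append(f"{region}: prompt longer than {max_chars} characters")
    for region in mask_regions:
        if region not in prompts:
            problems.append(f"{region}: mask has no prompt")
    return problems


issues = check_prompts(
    {"Sky": "Clear night sky with stars and full moon.", "Sea": "Calm waves."},
    ["Sky", "Van"],
)
for issue in issues:
    print(issue)
```

A check like this only catches structural mistakes. It cannot tell whether the wording of a prompt will produce the intended image, which is the harder part described above.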

Here is a prompt I have used with the Diffusion model in computer vision. Although it has brought some surprises, it is not actually my ultimate goal.

AI new trend, prompt engineering

Mouse interactive prompt engineering

Install via Docker

# setup
docker build --no-cache --tag cv-prompt-engineering -f Dockerfile .

# run
docker run --gpus all -v /home/ubuntu/work/cv-prompt-engineering/:/workspace/ -p 5555:5555 --rm -it --shm-size=55gb -d cv-prompt-engineering tail -f /dev/null

Run

streamlit run basic_app.py --server.port 5555 --server.enableCORS false
