Commit 10898c5: gemini

Cyclenerd committed Dec 14, 2023 (parent: 093507a)
Showing 3 changed files with 25 additions and 11 deletions.
17 changes: 13 additions & 4 deletions README.md
@@ -4,18 +4,18 @@
[![Badge: OpenAI](https://img.shields.io/badge/OpenAI-%23412991.svg?logo=openai&logoColor=white)](#readme)
[![Badge: Python](https://img.shields.io/badge/Python-3670A0?logo=python&logoColor=ffdd54)](#readme)

- This project is a drop-in replacement REST API for Vertex AI that is compatible with the OpenAI API specifications.
+ This project is a drop-in replacement REST API for Vertex AI (**PaLM 2, Codey, Gemini**) that is compatible with the OpenAI API specifications.
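To make the compatibility concrete, here is a minimal client-side sketch that points the official `openai` Python package (>= 1.0) at the proxy. The base URL, the `/v1` path prefix, and the `sk-XYZ` key are placeholder assumptions, not values taken from this commit; adjust them to your deployment.

```python
# Minimal sketch, assuming the proxy listens on http://localhost:8000/v1
# and was configured with OPENAI_API_KEY="sk-XYZ" (both placeholders).
from openai import OpenAI

client = OpenAI(api_key="sk-XYZ", base_url="http://localhost:8000/v1")
response = client.chat.completions.create(
    model="gemini-pro",  # routed to Vertex AI by the proxy
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```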

Examples:

- | Chat with Bard in Chatbot UI | Get help from Bard in VSCode |
+ | Chat with Gemini in Chatbot UI | Get help from Gemini in VSCode |
|-----------------------------------------------------------|---------------------------------------------------|
| ![Screenshot: Chatbot UI chat](./img/chatbot-ui-chat.png) | ![Screenshot: VSCode chat](./img/vscode-chat.png) |

This project is inspired by the idea of [LocalAI](https://github.com/go-skynet/LocalAI),
but with a focus on making [Google Cloud Platform Vertex AI PaLM](https://ai.google/) accessible to anyone.

- A Google Cloud Run service is installed that translates the OpenAI API calls to Vertex AI (PaLM).
+ A Google Cloud Run service is installed that translates the OpenAI API calls to Vertex AI (PaLM 2, Codey, Gemini).
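To illustrate what that translation involves, here is a hedged sketch (not the project's actual code) that maps an OpenAI-style message list onto a Vertex AI Gemini call using the `google-cloud-aiplatform` SDK pinned in requirements.txt. The helper name and the project/location values are illustrative placeholders.

```python
# Illustrative sketch only: collapsing OpenAI-style chat messages into a
# prompt and forwarding it to Vertex AI Gemini. Names and values below
# are placeholders, not this project's implementation.
import vertexai
from vertexai.preview.generative_models import GenerativeModel

vertexai.init(project="my-gcp-project", location="us-central1")  # placeholders

def chat_via_vertex(openai_messages: list[dict], max_output_tokens: int = 8192) -> str:
    # OpenAI request format: [{"role": "user", "content": "..."}, ...]
    prompt = "\n".join(m["content"] for m in openai_messages)
    response = GenerativeModel("gemini-pro").generate_content(
        prompt,
        generation_config={"max_output_tokens": max_output_tokens},
    )
    return response.text
```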

<p align="center">
<picture>
@@ -127,7 +127,16 @@ export OPENAI_API_KEY="sk-XYZ"
uvicorn vertex:app --reload
```

- Or run with the `codechat-bison-32k` 32k model:
+ Run with the Gemini `gemini-pro` model:
+
+ ```bash
+ export DEBUG="True"
+ export OPENAI_API_KEY="sk-XYZ"
+ export MODEL_NAME="gemini-pro"
+ uvicorn vertex:app --reload
+ ```
+
+ Run with the Codey `codechat-bison-32k` model:

```bash
export DEBUG="True"
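With the server running locally, a quick smoke test can confirm the OpenAI-compatible wire format. The `127.0.0.1:8000` address matches uvicorn's default; the `/v1` path prefix is an assumption, not something this commit shows.

```python
# Hedged smoke test: POST an OpenAI-style chat completion request to the
# locally running proxy. Address and path prefix are assumptions.
import requests

resp = requests.post(
    "http://127.0.0.1:8000/v1/chat/completions",
    headers={"Authorization": "Bearer sk-XYZ"},  # key set via OPENAI_API_KEY above
    json={
        "model": "gemini-pro",
        "messages": [{"role": "user", "content": "Say hello."}],
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```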
14 changes: 7 additions & 7 deletions requirements.txt
@@ -1,7 +1,7 @@
- fastapi==0.103.0
- uvicorn==0.23.2
- pydantic==1.10.12
- sse-starlette==1.6.5
- langchain==0.0.329
- transformers==4.32.1
- google-cloud-aiplatform==1.31.1
+ fastapi==0.105.0
+ uvicorn==0.24.0
+ pydantic==1.10.13
+ sse-starlette==1.8.2
+ langchain==0.0.350
+ transformers==4.36.1
+ google-cloud-aiplatform==1.38.1
5 changes: 5 additions & 0 deletions vertex.py
@@ -281,6 +281,8 @@ async def chat_completions(body: ChatBody, request: Request):
top_p = float(body.top_p or default_top_p)
max_output_tokens = int(body.max_tokens or default_max_output_tokens)
# Note: Max output tokens:
+ # - gemini-pro: 8192
+ #   https://cloud.google.com/vertex-ai/docs/generative-ai/model-reference/gemini
# - chat-bison: 1024
# - codechat-bison: 2048
# - ..-32k: The total amount of input and output tokens adds up to 32k.
@@ -289,6 +291,9 @@
if model_name == 'codechat-bison':
    if max_output_tokens > 2048:
        max_output_tokens = 2048
+ # Note: str.find() returns -1 (truthy) on a miss and 0 (falsy) for a match
+ # at index 0, so substring membership is the correct test here.
+ elif "gemini-pro" in model_name:
+     if max_output_tokens > 8192:
+         max_output_tokens = 8192
elif "32k" in model_name:
    if max_output_tokens > 16000:
        max_output_tokens = 16000
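The capping rule above reduces to a small pure function; this standalone sketch (with a hypothetical helper name) shows the intended behavior and why substring membership, not `str.find()`, is the right test.

```python
# Standalone sketch of the per-model output-token caps. str.find() returns
# 0 (falsy) for a match at index 0 and -1 (truthy) on a miss, which would
# invert the check, so `in` is used instead.
def cap_max_output_tokens(model_name: str, requested: int) -> int:
    if model_name == "codechat-bison":
        return min(requested, 2048)
    if "gemini-pro" in model_name:
        return min(requested, 8192)
    if "32k" in model_name:
        return min(requested, 16000)
    return requested

assert cap_max_output_tokens("gemini-pro", 100_000) == 8192
assert cap_max_output_tokens("codechat-bison", 100_000) == 2048
assert cap_max_output_tokens("codechat-bison-32k", 100_000) == 16000
```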
