Kong API Gateway with CUDO Compute
How to add authentication and SSL/TLS to your AI API
Kong API Gateway is a scalable platform designed for managing, securing, and orchestrating APIs and microservices. Built on top of NGINX, it provides high performance and flexibility, handling API traffic with low latency. Kong offers a wide range of features including load balancing, rate limiting, authentication, logging, and monitoring, making it a comprehensive solution for API management.
In this guide we will use Kong API Gateway to wrap an existing AI API with an HTTPS connection and key-based authentication. If you run a web application on another cloud but wish to use CUDO Compute for deploying AI such as LLMs, this tutorial shows you how to create a secure connection between the clouds.
Prerequisites
- Create a project and add an SSH key
- Optionally download the CLI tool
- Choose a VM with an NVIDIA GPU and configure it
- Use the Ubuntu 22.04 + NVIDIA drivers + Docker image (in the CLI tool, pass -image ubuntu-2204-nvidia-535-docker-v20240214)
- Start a VM with one or more GPUs
Start AI API
We will start a Docker network and run a Docker container with Ollama to deploy LLMs. Then we will run a second Docker container with Kong API Gateway that connects to Ollama. Kong is run without a database (DB-less mode), so it only requires a YAML file.
SSH onto your CUDO GPU VM and create a Docker network:
docker network create kong-net
Serve the Ollama API for LLMs. You can run whichever service you like; just make sure to run it on the kong-net network, and make a note of the container name and port:
sudo docker run --gpus=all --network=kong-net -d --name ollama -p 127.0.0.1:11434:11434 ollama/ollama
name: ollama port:11434
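Ollama starts with no models loaded, so it is worth pulling one and confirming the API responds locally before putting Kong in front of it. The model name here is an assumption; substitute whichever model you plan to serve:

```shell
# Pull an example model into the running Ollama container
# (llama3 is an assumed model name; use your own)
sudo docker exec ollama ollama pull llama3

# Confirm the API answers on the loopback port we published
curl http://127.0.0.1:11434/api/tags
```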
Make SSL Keys
On the CUDO VM create a self-signed SSL certificate. Replace CUDO-IP-ADDRESS with the CUDO VM's IP address:
mkdir kong
cd kong
openssl req -x509 -newkey rsa:4096 -keyout kong.key -out kong.crt -sha256 -days 3650 -nodes -subj '/CN=CUDO-IP-ADDRESS'
chmod 744 kong.key
chmod 744 kong.crt
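You can sanity-check the generated certificate before handing it to Kong; the subject should show the IP address you passed and the validity window should span roughly ten years:

```shell
# Print the certificate subject and validity window
openssl x509 -in kong.crt -noout -subject -dates
```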
Make a yaml file
This YAML file configures Kong to connect to the Ollama Docker container. If you are using another service, change the name and port of your Docker container in the url: http://ollama:11434. Here the key-auth Kong plugin is used to add key-based authentication. Swap my-key for your own secure key, and change the path to your desired path.
kong.yaml
_format_version: '3.0'
_transform: true
services:
  - name: ollama
    url: http://ollama:11434
    routes:
      - name: ollama-route
        paths:
          - /ollama
plugins:
  - name: key-auth
consumers:
  - username: kong-user
    keyauth_credentials:
      - key: my-key
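Before starting the gateway you can ask Kong to validate the file with its kong config parse subcommand, run here in a throwaway container; any indentation or schema mistake is reported without touching the running setup:

```shell
# Validate the declarative config in a disposable container
docker run --rm -v "$(pwd):/kong/" kong:3.6.1 kong config parse /kong/kong.yaml
```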
Run Kong docker container
Run a detached docker container with Kong:
docker run -d --name kong-dbless \
--network=kong-net \
-v "$(pwd):/kong/" \
-e "KONG_DATABASE=off" \
-e "KONG_DECLARATIVE_CONFIG=/kong/kong.yaml" \
-e "KONG_SSL=on" \
-e "KONG_SSL_CERT=/kong/kong.crt" \
-e "KONG_SSL_CERT_KEY=/kong/kong.key" \
-e "KONG_PROXY_ACCESS_LOG=/dev/stdout" \
-e "KONG_ADMIN_ACCESS_LOG=/dev/stdout" \
-e "KONG_PROXY_ERROR_LOG=/dev/stderr" \
-e "KONG_ADMIN_ERROR_LOG=/dev/stderr" \
-p 127.0.0.1:8000:8000 \
-p 8443:8443 \
kong:3.6.1
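If something looks wrong, the container logs are the first place to check; a clean start shows Kong listening on the proxy ports:

```shell
# Tail recent logs and confirm the container is up
docker logs --tail 20 kong-dbless
docker ps --filter name=kong-dbless
```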
Testing
Testing on VM
SSH onto the CUDO VM and run:
curl --header "apikey: my-key" -v http://localhost:8000/ollama
Swap /ollama for the path defined in the YAML file. You should see the expected output from your API.
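To confirm that the key-auth plugin is actually enforcing authentication, repeat the request without the header; Kong should answer with 401 Unauthorized rather than proxying to Ollama:

```shell
# Without the apikey header Kong should return 401, not the API output
curl -i http://localhost:8000/ollama
```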
Testing remotely
To test that port 8443 is open and serving TLS, run from your local machine:
curl --insecure --header "apikey: my-key" -v https://CUDO-IP-ADDRESS:8443/ollama
Testing with SSL and Python
As the certificate is self-signed, we need to copy it to our local machine and use it in our request:
scp [email protected]:/root/kong/kong.crt .
import requests

# Verify the self-signed certificate against the copy we downloaded
r = requests.get('https://CUDO-IP-ADDRESS:8443/ollama',
                 headers={'apikey': 'my-key'}, verify='kong.crt')
print(r, r.text)
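The same certificate also works from the command line. As a sketch, here is a POST against Ollama's /api/generate endpoint through the gateway; the llama3 model name is an assumption and must already be pulled on the VM. If hostname verification of the self-signed certificate fails, fall back to --insecure as above:

```shell
# Generate a completion through Kong; --cacert pins the self-signed cert
curl --cacert kong.crt --header "apikey: my-key" \
  https://CUDO-IP-ADDRESS:8443/ollama/api/generate \
  -d '{"model": "llama3", "prompt": "Why is the sky blue?", "stream": false}'
```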