How I made my own AI System Admin Assistant

Nov 25, 2024



I've been seeing a lot of people using AI tools as extra assistants for their work. Some go as far as having them write and build entire apps; others use them as their "RAG AI Chatbot". Here's how I made mine. You can also copy the compose file from my Github repository: https://github.com/Qoyyuum/mychatbot/blob/main/docker-compose.yml

services:
  openWebUI:
    image: ghcr.io/open-webui/open-webui:main
    container_name: mychatbot_openwebui
    hostname: openwebui
    restart: unless-stopped
    volumes:
      - ./open-webui-local:/app/backend/data:z
    ports:
      - "3010:8080" # Feel free to change the port 3010 to something else
    environment:
      # Point Open WebUI at the ollama service defined below (matches its hostname)
      OLLAMA_BASE_URLS: http://ollama:11434

  ollama:
    image: ollama/ollama:latest # Use tag :rocm if you're using AMD GPU
    container_name: mychatbot_ollama
    hostname: ollama
    ports:
      - "11434:11434"
    volumes:
      - ./ollama-local:/root/.ollama:z

I ran this on a local home server of mine. The server specs are pretty bare:

  • 16GB RAM

  • Intel i7-3770

  • No GPU

  • Fedora Server 40

So this made for a fun little experiment to see how much the server could handle. I ran podman compose -f docker-compose.yml up -d --build

It took a long while to pull the ollama and open-webui images. After a patient wait, the stack finally started.
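
If you want to confirm both containers are up before opening the browser, the standard podman commands are enough. A quick sketch, using the container names from the compose file above:

# Check that both containers are running
podman ps

# Follow the Open WebUI logs until it reports it is serving
podman logs -f mychatbot_openwebui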

I navigated to http://localhost:3010 and set up my Admin account. Then I followed the tutorial docs to make my own RAG AI Chatbot here: https://docs.openwebui.com/tutorials/tips/rag-tutorial

It was pretty straightforward. I gave it the documentation for the system I'm maintaining, and I configured and downloaded llama3.1 as the base model. Here are the workspace settings I used (feel free to use them as a benchmark).
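
If you prefer the command line over the UI for downloading models, you can pull llama3.1 directly inside the Ollama container. A minimal sketch, assuming the container name from the compose file above:

# Pull the llama3.1 base model inside the running Ollama container
podman exec -it mychatbot_ollama ollama pull llama3.1

# List local models to confirm the download
podman exec -it mychatbot_ollama ollama list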

The Model

The model I decided to use was the multi-agent one from https://openwebui.com/m/stewart/multi-agent:latest

Knowing the limitations of my tiny home server, I decided to time how long it takes to return a response.
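
If you want a repeatable benchmark rather than watching the clock, you can time a request against the Ollama API directly. A rough sketch, assuming llama3.1 is already pulled and port 11434 is exposed as in the compose file:

# Time a single, non-streaming generation against the Ollama API
time curl http://localhost:11434/api/generate \
  -d '{"model": "llama3.1", "prompt": "Why is the sky blue?", "stream": false}'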

The Query

For a test, I gave it a recent question about a problem I had already solved, so I could benchmark how quickly and accurately it can answer a real problem when one presents itself. The query was as follows:

I have a problem with the CMC alert. Please advise a fix for the following error message:

Critical conditions::
Critical hardware component failure.
C2F power module is not operating.

Warning conditions::
More spare drives are expected.

This relates to a Dell CMC hardware unit. Typically when this happens, we would generate a diagnostics report and send it to Dell support as a new case ticket, which would eventually result in a part replacement for one of the two CMCs. So I would expect the response to at least give me the command to run the diagnostics.
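
For context, the diagnostics I had in mind are roughly the kind of thing you can pull from the CMC with racadm. The exact commands vary by chassis and firmware, so treat the following as a sketch of the general idea rather than the official Dell procedure (the hostname is a placeholder):

# SSH into the CMC (placeholder hostname; use your chassis address)
ssh admin@cmc-hostname

# From the CMC shell, racadm can dump what Dell support usually asks for
racadm getsysinfo   # chassis, firmware and network summary
racadm getmodinfo   # presence and health of modules, including both CMCs
racadm getraclog    # CMC log containing the critical/warning events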

The Response

I sent the message at 3:03pm. The response started coming in at 3:23pm, slowly, with each word spelled out every 2 seconds.

While it was responding, I noticed the server metrics were running HOT 🔥🔥🔥🔥! It pushed 4 of the server's 8 CPU cores to 100% usage.
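
If you'd rather watch this from the host than from a system monitor, podman can report per-container resource usage. A quick sketch, using the Ollama container name from the compose file:

# Live CPU and memory usage for the Ollama container while it generates
podman stats mychatbot_ollama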

And the answer wasn't so good either.

Based on the provided context, it seems like you have two separate user queries, both related to issues with the CMC (Chassis Management Controller) alert and Puppet-controlled hosts.

For the first query about the CMC alert error message:

According to the "dell-chassis-mgmt-cntrllr-v4.3Reference Guideen-us.pdf" source [dell-chassis-mgmt-cntrllr-v4.3Reference Guideen-us.pdf], it appears that you're experiencing a critical hardware component failure with the C2F power module not operating.

However, without specific information on how to address this issue directly in the provided context or through my own knowledge base, I'm unable to offer a precise fix for this problem.

So I decided not to run it again or ask it another question (who knows? The server might burst into flames the next time I do).

I guess I really need to give it a GPU. Then I would have to change the ollama service to use the Nvidia GPU, like so:

# To be tested once I have a Nvidia GPU
services:
  openWebUI:
    image: ghcr.io/open-webui/open-webui:main
    container_name: mychatbot_openwebui
    hostname: openwebui
    restart: unless-stopped
    volumes:
      - ./open-webui-local:/app/backend/data:z
    ports:
      - "3010:8080" # Feel free to change the port 3010 to something else
    environment:
      OLLAMA_BASE_URLS: http://ollama:11434

  ollama:
    image: ollama/ollama:latest # Use tag :rocm if you're using AMD GPU
    container_name: mychatbot_ollama
    hostname: ollama
    ports:
      - "11434:11434"
    volumes:
      - ./ollama-local:/root/.ollama:z    
    deploy:
      resources:
        reservations:
          devices:
          - driver: nvidia
            capabilities: ["gpu"]
            count: all

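Once that GPU is actually installed, it's worth verifying that Ollama can see it before blaming the model again. A rough sketch of the checks I would run, assuming the Nvidia container toolkit is set up on the host:

# Confirm the GPU is visible from inside the Ollama container
podman exec -it mychatbot_ollama nvidia-smi

# Ollama also logs at startup whether it found a compatible GPU
podman logs mychatbot_ollama | grep -i gpu
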

So for now, I will have to put this chatbot aside until I have the money to buy and install an Nvidia GPU card. Thanks for reading! This experience has been a blast!
