Running Fabric Locally with Ollama: A Step-by-Step Guide - Bernhard Knasmüller on Software Development

In the realm of Large Language Models (LLMs), Daniel Miessler’s fabric project is a popular choice for collecting and integrating various LLM prompts. However, by default it requires access to the OpenAI API, which can lead to unexpected costs. Enter ollama, an alternative that runs LLMs locally on powerful hardware such as Apple Silicon chips or dedicated GPUs. In this guide, we’ll explore how to modify fabric to work with ollama.

Step 1: Install Ollama

To begin, install ollama according to the official instructions at ollama.com/download.

Step 2: Pull a Model

Next, pull your preferred model using the command ollama pull <model_name>. I recommend mistral:instruct for this demonstration:

ollama pull mistral:instruct

To ensure that everything is set up correctly, test ollama by sending a local curl request:

curl http://localhost:11434/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
        "model": "mistral:instruct",
        "messages": [
            {
                "role": "system",
                "content": "You are a helpful assistant."
            },
            {
                "role": "user",
                "content": "Why is the sky blue?"
            }
        ]
    }'

A typical reply looks like this:

{"id":"chatcmpl-355","object":"chat.completion","created":1708167599,"model":"mistral:instruct","system_fingerprint":"fp_ollama","choices":[{"index":0,"message":{"role":"assistant","content":" The color of the sky appears blue due to a process called Rayleigh scattering. As sunlight reaches Earth's atmosphere, it interacts with different gases and particles present in the air. Blue light has a shorter wavelength and gets scattered more easily than other colors, resulting in the scattered blue light being more visible to our eyes from all directions. Thus, the sky appears blue during a clear day."},"finish_reason":"stop"}],"usage":{"prompt_tokens":16,"completion_tokens":83,"total_tokens":99}}

Note that this curl request has the same format as the OpenAI API call but uses the locally running LLM under the hood.
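For readers who prefer Python over curl, here is a minimal sketch of the same request using only the standard library. The function names (`build_payload`, `extract_reply`, `ask_ollama`) are my own and not part of fabric or ollama; the URL and the JSON fields are taken from the curl example and the sample reply above.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
OLLAMA_CHAT_URL = "http://localhost:11434/v1/chat/completions"


def build_payload(model, user_prompt, system_prompt="You are a helpful assistant."):
    """Build an OpenAI-style chat completion request body (hypothetical helper)."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    }


def extract_reply(response_body):
    """Return the assistant's text from a chat completion response given as a JSON string."""
    data = json.loads(response_body)
    # The reply text sits in the first choice, as in the sample response above.
    return data["choices"][0]["message"]["content"].strip()


def ask_ollama(model, user_prompt, url=OLLAMA_CHAT_URL, timeout=120):
    """POST the request to the local ollama server and return the reply text."""
    request = urllib.request.Request(
        url,
        data=json.dumps(build_payload(model, user_prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_reply(response.read().decode("utf-8"))
```

With ollama running and the model pulled, `ask_ollama("mistral:instruct", "Why is the sky blue?")` should return the answer text without the surrounding JSON.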

If the curl request fails, verify that ollama is running and try invoking it via ollama serve if necessary.
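A quick way to check whether the server is reachable at all is to probe its base URL. The sketch below assumes that ollama answers a plain GET on `http://localhost:11434` with HTTP status 200; that is my assumption rather than something shown above, and the helper name `is_ollama_running` is hypothetical.

```python
import urllib.error
import urllib.request


def is_ollama_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if an HTTP server answers at base_url with status 200 (assumed health signal)."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

If this returns False, start the server with ollama serve and try again.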

Step 3: Set Up Fabric Locally

Now, let’s modify fabric to work locally using ollama. Follow these instructions to set up fabric: github.com/danielmiessler/fabric. Note that during the setup process, you will be prompted for a valid OpenAI API key; however, it won’t be used later for model invocations.

To make fabric work locally with ollama, modify the utils.py file in the installer/client/cli directory as follows:

diff --git a/installer/client/cli/utils.py b/installer/client/cli/utils.py
index 9f0e522..431db1e 100644
--- a/installer/client/cli/utils.py
+++ b/installer/client/cli/utils.py
@@ -35,9 +35,13 @@ class Standalone:
         env_file = os.path.expanduser(env_file)
         load_dotenv(env_file)
         try:
-            apikey = os.environ["OPENAI_API_KEY"]
-            self.client = OpenAI()
-            self.client.api_key = apikey
+            #apikey = os.environ["OPENAI_API_KEY"]
+            #self.client = OpenAI()
+            #self.client.api_key = apikey
+            self.client = OpenAI(
+                base_url = 'http://localhost:11434/v1',
+                api_key='ollama', # required, but unused
+            )
         except KeyError:
             print("OPENAI_API_KEY not found in environment variables.")

With this code change, you instruct the OpenAI client to use a different base URL (namely the local one).

Now that you have made these changes, invoke fabric using the local Mistral model (make sure to explicitly set the model name via --model):

curl --silent --show-error https://gist.githubusercontent.com/bknasmueller/deb128a189d11e709234dd7191c43733/raw/2d42d7d7ff99566657a44df0bd71a6b0b5ab29d0/signed_prompt_blog_post.md | fabric --pattern summarize --model mistral:instruct --stream

An example output might look like this:

# ONE SENTENCE SUMMARY:
The reviewed paper proposes a new method called "Signed-Prompt" to prevent prompt injection attacks against LLM-integrated applications by substituting original instructions with random keywords, creating a whitelist of approved commands for the model.

# MAIN POINTS:
1. Prompt injections are still an unsolved problem when deploying applications integrating LLMs.
2. Conventional defenses against injection attacks like SQL injection or XSS are not applicable to prompt injections.
3. The approach presented in the paper does not rely on input filtering or output encoding but signs authorized instructions instead.
4. Signing involves substituting commands with random keywords, making it difficult for attackers to provide signed commands to the model.
5. The author claims that his proposed mechanism effectively prevents prompt injections in 100% of investigated cases.
6. A limitation of this approach is the requirement to keep track of a whitelist of approved commands, which restricts the model's generic capabilities.
7. The paper does not provide exact data set details or source code for implementation and experimentation.
8. The Signed Prompts approach seems to be a useful step in the right direction against prompt injection attacks.
9. The author expects that this method could prevent all prompt injection attacks if properly implemented and tested.
10. Further research is needed to validate the effectiveness of the Signed Prompts approach and its impact on model capabilities.

# TAKEAWAYS:
1. Prompt injections are a significant issue for applications integrating LLMs, and conventional defenses do not provide adequate protection.
2. The Signed-Prompt approach offers a new way to prevent prompt injections by substituting original instructions with random keywords and maintaining a whitelist of approved commands.
3. While the effectiveness of this method is promising, it requires further research, implementation, and experimentation to validate its impact on model capabilities and overall security.
4. Keeping track of a whitelist of approved commands can be challenging but necessary for implementing the Signed-Prompt approach effectively.
5. The Signed-Prompt approach could potentially prevent all prompt injection attacks if properly implemented, tested, and maintained.

Notes

Please note that by implementing this change, fabric only works with local models and the OpenAI API is no longer usable. ~~If necessary, it would be easy to implement a model-dependent switch, but I expect this functionality will eventually find its way into fabric anyway.~~ Update: See below

Also note that the local models tend to be less capable than GPT-4, which is more visible in some patterns than in others (summarize works quite well for me, but extract_wisdom seems to struggle especially when using longer inputs).

Update February 18: Switch between local and OpenAI usage

Since the local models lack several capabilities, it comes in very handy to be able to switch between local and “remote” (OpenAI API) use.

You can achieve this easily by changing the code to the following:

if "gpt-4" in args.model:
    apikey = os.environ["OPENAI_API_KEY"]
    self.client = OpenAI()
    self.client.api_key = apikey
else:
    self.client = OpenAI(
        base_url = 'http://localhost:11434/v1',
        api_key='ollama', # required, but unused
    )

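This causes fabric to use the OpenAI API for every model whose name contains the string “gpt-4”. All other model names, including other OpenAI models such as gpt-3.5-turbo, are sent to the local ollama endpoint.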
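To make the routing rule easier to test in isolation, the decision can be pulled into a small function that only returns the constructor arguments. This is a sketch, not fabric's actual code: the name `client_kwargs_for_model` and the `env` parameter are my own, while the base URL, the placeholder key and the “gpt-4” check come from the snippet above.

```python
import os

# Local OpenAI-compatible endpoint, as used in the snippet above.
LOCAL_BASE_URL = "http://localhost:11434/v1"


def client_kwargs_for_model(model, env=os.environ):
    """Return keyword arguments for the OpenAI client, depending on the model name.

    Hypothetical helper: raises KeyError if a "gpt-4" model is requested
    but OPENAI_API_KEY is missing from env.
    """
    if "gpt-4" in model:
        # Remote use: the real API key is needed.
        return {"api_key": env["OPENAI_API_KEY"]}
    # Local use: the key is required by the client but unused by ollama.
    return {"base_url": LOCAL_BASE_URL, "api_key": "ollama"}
```

Inside utils.py this could then be used as `self.client = OpenAI(**client_kwargs_for_model(args.model))`, which keeps the existing `except KeyError` handler meaningful for the remote case.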