Practical session: introduction to LLMs

This is the first practical session accompanying the LLM introduction course at the IDESSAI 2024 summer school.

Contact: christophe.cerisara@loria.fr

Choice of library

There exist several libraries to manipulate LLMs. The one you should choose mainly depends on two factors: your target task and your available hardware. Below is a small selection of libraries grouped by task. Of course, there exist several other powerful libraries for LLMs, but these are the most common ones as of August 2024. In the following, we will focus on the ollama library.

Inference libraries

Finetuning libraries

Pretraining libraries

Practical session: ollama

ollama is designed to make it easy to try out open-source LLMs locally. With just one or a few commands, it downloads a quantized LLM and launches an OpenAI-compatible server, which you may interact with using one of the many available ChatGPT-compatible clients. (Personal note: my preferred client is quite geeky, pure Linux command line: charm.sh mods.) Ollama also provides simple command-line scripts to immediately start chatting with the LLM, without any server. As of August 2024, it is one of the preferred ways to quickly start using an LLM.
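
As an illustration, here is a minimal sketch (not part of the original session) of how the OpenAI-compatible server started by ollama can be queried from Python with the openai client; it assumes the server is running locally on ollama's default port 11434:

from openai import OpenAI

# ollama exposes an OpenAI-compatible endpoint under /v1 on its default port
client = OpenAI(
    base_url="http://localhost:11434/v1",
    api_key="ollama",  # the client requires a key, but ollama ignores its value
)

response = client.chat.completions.create(
    model="llama3.1",
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)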

Q1: Use Llama3.1 locally with ollama

Run one of the following commands in a terminal:

ollama run llama3.1   # downloads the quantized model on first use, then opens an interactive chat
ollama run gemma2:2b  # same, with a smaller model that fits on limited hardware
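
Inside the interactive chat, you can type /bye to quit.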

Q2: Use Llama3.1 to interact with tools

Any LLM is limited to the knowledge it has been trained on (and thus to its training cutoff date), and it can only interact through text. A major trend in mid-2024 is to let LLMs interact with external tools, such as a calculator, a web search engine, a Python script execution sandbox… The underlying principle is to finetune the LLM to generate a special structured text format, in which the LLM writes the ID of some external tool and its arguments. The program that is calling the LLM can then interpret this structured format and execute the call to the specified external tool. We can then continue our conversation with the LLM by feeding it the answer from the tool.
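
To make this concrete, here is an illustrative sketch of the structured message that a tool-calling LLM may emit instead of a plain text answer. The field names follow the ollama Python API used below; the values themselves are made up:

# Hypothetical tool-call message, as found in response['message'] after ollama.chat
tool_call_message = {
    'role': 'assistant',
    'content': '',
    'tool_calls': [{
        'function': {
            'name': 'getnews',               # the ID of the external tool to call
            'arguments': {'country': 'usa'}, # the arguments chosen by the LLM
        },
    }],
}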

One important part is missing: before doing all this, you must give ollama the list of available external tools. This is done by installing the ollama pip library, which enables you to call ollama from Python and to define one Python method per tool.
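
Assuming a standard Python environment, the needed libraries can be installed with pip (requests is used by the example below):

pip install ollama requests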

Important: When listing the tools/Python methods for ollama, it's important to clearly describe in plain English what each method does, as well as each of its arguments, because the LLM decides whether to call a given tool based on this description!

Let's now put it into practice:

import ollama
import requests
import json

messages = [{'role': 'user', 'content': 'What is the main news right now in the USA?'}]

# Map country names to the two-letter codes expected by the news API
COUNTRY_CODES = {
    'france': 'fr',
    'india': 'in',
    'usa': 'us',
    'australia': 'au',
    'russia': 'ru',
    'united kingdom': 'gb',
}

def getnews(country):
    """Return the top headline (title and content) for the given country."""
    code = COUNTRY_CODES.get(country.lower().strip())
    if code is None:
        print("unknown country", country)
        code = 'fr'  # fall back to French news
    url = "https://saurav.tech/NewsAPI/top-headlines/category/general/" + code + ".json"
    print("calling the tool")
    response = requests.get(url)
    print("tool result", response.text)
    print("\n" * 5)

    # Keep only the title and content of the first article
    news = json.loads(response.text)
    headline = news['articles'][0]['title'] + ": " + news['articles'][0]['content']
    print("extracted news", headline, "\n" * 3)
    return headline

def main():
    response = ollama.chat(
        model='llama3.1',
        messages=messages,
        tools=[
          {
            'type': 'function',
            'function': {
              'name': 'getnews',
              'description': 'Get recent news from a country',
              'parameters': {
                'type': 'object',
                'properties': {
                    'country': {
                        'type': 'string',
                        'description': 'The name of the country',
                        },
                },
                'required': ['country'],
              },
            },
          },
        ],
    )

    # Add the model's response to the conversation history
    messages.append(response['message'])
    print("first answer",response['message'])

    # Check if the model decided to use the provided function
    if not response['message'].get('tool_calls'):
        print("The model didn't use the function. Its response was:")
        print(response['message']['content'])
        return

    # Process the function calls made by the model (we returned above if there were none)
    available_functions = {
        'getnews': getnews,
    }
    for tool in response['message']['tool_calls']:
        function_to_call = available_functions[tool['function']['name']]
        function_response = function_to_call(tool['function']['arguments']['country'])
        # Add the tool's answer to the conversation
        messages.append(
            {
                'role': 'tool',
                'content': function_response,
            }
        )

    # Second API call: Get final response from the model
    final_response = ollama.chat(model='llama3.1', messages=messages)
    print(final_response['message']['content'])

if __name__ == '__main__':
    main()

# adapted from https://github.com/ollama/ollama-python/blob/main/examples/tools/main.py
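
Note that to run this script, the local ollama server must be running (it usually starts automatically with ollama; otherwise launch it with ollama serve), and the llama3.1 model must have been downloaded beforehand, for instance with the ollama run llama3.1 command from Q1.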