How to Build a Chatbot Using Streamlit and Llama 2

Llama 2 is an open-source large language model (LLM) developed by Meta. It is a competent model, arguably better than some closed models like GPT-3.5 and PaLM 2, and it comes in three pre-trained and fine-tuned generative text sizes: 7 billion, 13 billion, and 70 billion parameters.

You will explore Llama 2's conversational capabilities by building a chatbot using Streamlit and Llama 2.

How different is Llama 2 from its predecessor, Llama 1?

Llama 2 significantly outperforms its predecessor: it was trained on roughly 40 percent more data, doubles the context length to 4,096 tokens, and ships with chat-tuned variants refined through reinforcement learning from human feedback (RLHF). These characteristics make it a potent tool for many applications, such as chatbots, virtual assistants, and natural language comprehension.

To start building your application, you have to set up a development environment that isolates your project from the existing projects on your machine.

First, create a virtual environment using Pipenv as follows:
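The command below creates the environment (if one does not exist yet) and activates its shell:

```
pipenv shell
```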

Next, install the libraries necessary to build the chatbot. You will use two of them; the install command follows their descriptions below.

Streamlit: An open-source web app framework for quickly building machine learning and data science applications.

Replicate: A cloud platform that provides access to large open-source machine learning models for deployment.
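With the Pipenv shell active, one command installs both packages. Here, python-dotenv is an extra assumption, used later to load the .env file:

```
pipenv install streamlit replicate python-dotenv
```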

To get a Replicate API token, you must first register an account on Replicate using your GitHub account.

Once you have accessed the dashboard, click the Explore button and search for "Llama 2 chat" to find the llama-2-70b-chat model.

Click on the llama-2-70b-chat model to view its Llama 2 API endpoints. Click the API button on the llama-2-70b-chat model's navigation bar, then, on the right side of the page, click the Python button. This gives you access to the API token for Python applications.

Copy the REPLICATE_API_TOKEN and store it safely for future use.

First, create a Python file called llama_chatbot.py and an env file (.env). You will write your code in llama_chatbot.py and store your secret keys and API tokens in the .env file.

In the llama_chatbot.py file, import the libraries as follows:
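A plausible set of imports, assuming python-dotenv handles the .env file created in the next step:

```python
import os

import streamlit as st
from dotenv import load_dotenv
```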

Next, set the global variables for the Llama 2 model endpoints, including llama-2-70b-chat:
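A sketch of those globals. The environment variable names (MODEL_ENDPOINT7B and so on) are illustrative; they only need to match what you put in the .env file below:

```python
# Read the .env file into the process environment so the Replicate
# client can pick up REPLICATE_API_TOKEN automatically
load_dotenv()

# One endpoint string per model size, copied from each model's API page
LLaMA2_7B_ENDPOINT = os.environ.get("MODEL_ENDPOINT7B", "")
LLaMA2_13B_ENDPOINT = os.environ.get("MODEL_ENDPOINT13B", "")
LLaMA2_70B_ENDPOINT = os.environ.get("MODEL_ENDPOINT70B", "")
```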

In the .env file, add the Replicate token and model endpoints in the following format:
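A template with placeholders; replace the <owner> and <version> parts with the exact identifier strings shown on each model's API page:

```
REPLICATE_API_TOKEN='your_replicate_api_token'
MODEL_ENDPOINT7B='<owner>/llama-2-7b-chat:<version>'
MODEL_ENDPOINT13B='<owner>/llama-2-13b-chat:<version>'
MODEL_ENDPOINT70B='<owner>/llama-2-70b-chat:<version>'
```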

Paste your Replicate token and save the .env file.

Create a pre-prompt to prime the Llama 2 model for the task you want it to perform. In this case, you want the model to act as an assistant.
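One plausible default prompt; the exact wording is yours to choose, but phrasing like this discourages the model from role-playing both sides of the chat:

```python
# Default system prompt that primes Llama 2 to behave as an assistant
PRE_PROMPT = (
    "You are a helpful assistant. You do not respond as 'User' or "
    "pretend to be 'User'. You respond only once as 'Assistant'."
)
```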

Set up the page configuration for your chatbot as follows:
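A minimal configuration; the title and icon values are illustrative:

```python
# Browser-tab title and icon, plus a wide layout for the chat area
st.set_page_config(
    page_title="LLaMA2Chat",
    page_icon="🦙",
    layout="wide",
)
```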

Write a function that initializes and sets up session state variables.
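A sketch under the assumption that the defaults mirror common Llama 2 settings; the function name and numbers are illustrative:

```python
def setup_session_state():
    # Chat history as a list of {"role": ..., "content": ...} dicts
    if "chat_dialogue" not in st.session_state:
        st.session_state["chat_dialogue"] = []
    if "pre_prompt" not in st.session_state:
        st.session_state["pre_prompt"] = PRE_PROMPT
    # Default to the 70B endpoint; the sidebar lets the user switch
    if "llm" not in st.session_state:
        st.session_state["llm"] = LLaMA2_70B_ENDPOINT
    # Sampling defaults, overridden by the sidebar sliders on each rerun
    if "top_p" not in st.session_state:
        st.session_state["top_p"] = 0.9
    if "max_seq_len" not in st.session_state:
        st.session_state["max_seq_len"] = 512
    if "temperature" not in st.session_state:
        st.session_state["temperature"] = 0.75
```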

The function sets essential variables like chat_dialogue, pre_prompt, llm, top_p, max_seq_len, and temperature in the session state, giving each a sensible default. The selected Llama 2 endpoint (llm) is then updated whenever the user picks a different model in the sidebar.

Write a function to render the sidebar content of the Streamlit app.
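A sketch of the sidebar; the labels and slider ranges are illustrative:

```python
def render_sidebar():
    st.sidebar.header("LLaMA2 Chatbot")
    # Map a friendly label to the matching Replicate endpoint
    model_choice = st.sidebar.selectbox(
        "Choose a LLaMA2 model:",
        ["LLaMA2-7B", "LLaMA2-13B", "LLaMA2-70B"],
        key="model_choice",
    )
    endpoints = {
        "LLaMA2-7B": LLaMA2_7B_ENDPOINT,
        "LLaMA2-13B": LLaMA2_13B_ENDPOINT,
        "LLaMA2-70B": LLaMA2_70B_ENDPOINT,
    }
    st.session_state["llm"] = endpoints[model_choice]
    # Generation settings the user can tweak
    st.session_state["temperature"] = st.sidebar.slider(
        "Temperature:", 0.01, 5.0, 0.75, 0.01)
    st.session_state["top_p"] = st.sidebar.slider(
        "Top P:", 0.01, 1.0, 0.9, 0.01)
    st.session_state["max_seq_len"] = st.sidebar.slider(
        "Max Sequence Length:", 64, 4096, 512, 8)
```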

The function displays the header and the adjustable settings of the Llama 2 chatbot, such as the model choice and sampling parameters.

Write the function that renders the chat history in the main content area of the Streamlit app.
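A minimal version using Streamlit's chat elements (available in Streamlit 1.24 and later):

```python
def render_chat_history():
    # Replay every saved message under its author's avatar
    for message in st.session_state["chat_dialogue"]:
        with st.chat_message(message["role"]):
            st.markdown(message["content"])
```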

The function iterates through the chat_dialogue saved in the session state, displaying each message with the corresponding role (user or assistant).

Handle the user's input using the function below.
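A sketch using st.chat_input, which pins a message box to the bottom of the page:

```python
def handle_user_input():
    user_input = st.chat_input("Type your question here...")
    if user_input:
        # Record and immediately display the user's message
        st.session_state["chat_dialogue"].append(
            {"role": "user", "content": user_input})
        with st.chat_message("user"):
            st.markdown(user_input)
```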

This function presents an input field where the user can enter messages and questions. Once submitted, the message is added to the chat_dialogue in the session state with the user role.

Write a function that generates responses from the Llama 2 model and displays them in the chat area.
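A sketch; the User:/Assistant: turn format is one common convention for Llama 2 chat prompts, and debounce_replicate_run is defined in utils.py later in this tutorial:

```python
def generate_assistant_response():
    dialogue = st.session_state["chat_dialogue"]
    # Only generate when the newest message came from the user
    if dialogue and dialogue[-1]["role"] == "user":
        with st.chat_message("assistant"):
            placeholder = st.empty()
            # Flatten the whole dialogue into one prompt string
            history = st.session_state["pre_prompt"] + "\n"
            for msg in dialogue:
                speaker = "User" if msg["role"] == "user" else "Assistant"
                history += f"{speaker}: {msg['content']}\n"
            history += "Assistant: "
            full_response = ""
            # Stream tokens back, updating the placeholder as they arrive
            for token in debounce_replicate_run(
                st.session_state["llm"],
                history,
                st.session_state["max_seq_len"],
                st.session_state["temperature"],
                st.session_state["top_p"],
            ):
                full_response += str(token)
                placeholder.markdown(full_response + "▌")
            placeholder.markdown(full_response)
        dialogue.append({"role": "assistant", "content": full_response})
```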

The function builds a conversation history string that includes both user and assistant messages before calling the debounce_replicate_run function to obtain the assistant's response. It updates the response in the UI as tokens stream in, giving a real-time chat experience.

Write the main function responsible for rendering the entire Streamlit app.
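Assuming the helper names used in the sketches above:

```python
def render_app():
    # Order matters: state first, then UI, then the model call
    setup_session_state()
    render_sidebar()
    render_chat_history()
    handle_user_input()
    generate_assistant_response()
```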

It calls all the defined functions in a logical order: setting up the session state, rendering the sidebar and the chat history, handling user input, and generating assistant responses.

Write a function to invoke the render_app function and start the application when the script is executed.
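The standard Python entry-point idiom:

```python
def main():
    render_app()


if __name__ == "__main__":
    main()
```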

Now your application should be ready for execution.

Create a utils.py file in your project directory and add the function below:
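A sketch of utils.py. The 10-second interval is an arbitrary choice, and the input keys (max_length, repetition_penalty) follow one common Llama 2 schema on Replicate; check your model's API page for the exact names:

```python
import time

import replicate

# Timestamp of the last accepted API call; module state persists
# across Streamlit reruns within the same server process
last_call_time = 0
DEBOUNCE_INTERVAL = 10  # minimum seconds between Replicate calls


def debounce_replicate_run(llm, prompt, max_len, temperature, top_p):
    global last_call_time
    current_time = time.time()
    # Reject calls arriving faster than the debounce interval
    if current_time - last_call_time < DEBOUNCE_INTERVAL:
        return ["Your requests are too fast. Please wait a few "
                "seconds before sending another message."]
    last_call_time = current_time
    # replicate.run streams output from the hosted Llama 2 model;
    # it reads REPLICATE_API_TOKEN from the environment
    return replicate.run(
        llm,
        input={
            "prompt": prompt,
            "max_length": max_len,
            "temperature": temperature,
            "top_p": top_p,
            "repetition_penalty": 1,
        },
    )
```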

The function implements a debounce mechanism that prevents frequent and excessive API queries from a user's input.

Next, import the debounce response function into your llama_chatbot.py file as follows:
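Assuming utils.py sits in the same directory as llama_chatbot.py:

```python
from utils import debounce_replicate_run
```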

Now run the application:
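From the project directory:

```
streamlit run llama_chatbot.py
```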

Expected output:

The output shows a conversation between the model and a human.

Real-world applications of Llama 2 include chatbots, virtual assistants, and natural language comprehension tools.

With closed models like GPT-3.5 and GPT-4, it is quite difficult for small players to build anything of substance using LLMs, since access to the GPT model API can be expensive.

Opening up advanced large language models like Llama 2 to the developer community is just the beginning of a new era of AI. It will lead to more creative and innovative implementation of the models in real-world applications, leading to an accelerated race toward achieving Artificial Super Intelligence (ASI).

