Llama 2: Installation to Interaction Guide

Akash

Engineering Lead

5 min read

Tags:

Generative-AI MetaAI Llama2 ctransformers Prompt Engineering

Introduction

The race to build capable generative Large Language Models (LLMs) has intensified since OpenAI released its GPT models. Companies are now competing to develop their own LLMs, a cumbersome process that involves extensive research and a great deal of trial and error. One of the key challenges is curating high-quality datasets, because a model's effectiveness depends heavily on the data it is trained on.

In this blog, we will explore Llama, a family of generative AI models developed by Meta AI, the AI research division of Meta (formerly Facebook). We will discuss the features and capabilities of Llama 2, the latest version of the model at the time of writing. We will also explain how researchers and developers can request access to the Llama 2 model weights.

Llama: A Generative AI Model

Llama (Large Language Model Meta AI) is a Generative AI model developed by Meta AI. The model was announced in February 2023, and it represents a group of foundational LLMs developed by the company. With the introduction of Llama, Meta has entered the LLM space and is now competing with OpenAI's GPT and Google's PaLM models.

One of the distinguishing features of Llama is that its weights are openly available. Meta AI released the original Llama weights to researchers for non-commercial use, and Llama 2 is released under a license that also permits commercial use, subject to Meta's license terms. This is not the case with closed models such as GPT and PaLM. The move has opened up new possibilities for researchers and developers, who can now download and work with the model weights directly.

Llama 2: A Step Forward

Llama 2 is the latest version of the Llama model and surpasses its predecessor, Llama 1, in performance and capabilities. Llama 2 was pre-trained on 2 trillion tokens, a significant increase over the previous version. The context length for all Llama 2 models is 4k tokens, twice that of Llama 1.
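To make the context length concrete: the context window is shared between the prompt and the generated reply, so a larger reply limit leaves less room for the prompt. The sketch below assumes that "4k" means 4,096 tokens and uses an illustrative reply limit of 512 tokens; the function name is hypothetical and not part of any library.

```python
LLAMA2_CONTEXT_TOKENS = 4096  # assumption: "4k" context means 4,096 tokens
LLAMA1_CONTEXT_TOKENS = LLAMA2_CONTEXT_TOKENS // 2  # half the Llama 2 window


def prompt_token_budget(context_tokens: int, max_new_tokens: int) -> int:
    """Return how many tokens remain for the prompt after reserving the reply."""
    if max_new_tokens < 0 or max_new_tokens > context_tokens:
        raise ValueError("max_new_tokens must be between 0 and context_tokens")
    return context_tokens - max_new_tokens


# Reserve 512 tokens for the reply (an illustrative value)
print(prompt_token_budget(LLAMA2_CONTEXT_TOKENS, 512))  # 3584
print(prompt_token_budget(LLAMA1_CONTEXT_TOKENS, 512))  # 1536
```

With the same reply limit, the doubled window leaves more than twice as much room for the prompt.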

At the time of its release, Llama 2 ranked at the top of Hugging Face's Open LLM Leaderboard, outperforming open-source models such as Falcon and MPT on various benchmarks, including MMLU, TriviaQA, Natural Questions, HumanEval, and others. The comprehensive benchmark scores for Llama 2 can be found on Meta AI's website.

Furthermore, Llama 2 has undergone fine-tuning for chat-related use cases, involving training with over 1 million human annotations. These chat models are readily available to use on the Hugging Face website.

Access to Llama 2

The source code for Llama 2 is available on GitHub, so researchers and developers can read and modify it, subject to the license terms. To access the original Llama 2 weights, however, users need to submit a request on the Meta AI website: provide your name, email address, and organization (enter "student" if you are not working), then click "Accept and continue". Once the email is verified, users can download the model weights and start working with them.

Working with Llama 2

Now that we have discussed the features and capabilities of Llama 2, let's explore how researchers and developers can work with this model using the Hugging Face Hub, LangChain, and CTransformers.

The first step is to install the huggingface_hub, langchain, and ctransformers packages using the following command (the leading "!" runs it from a notebook cell; omit it in a terminal):

                        
 !pip install --quiet huggingface_hub langchain ctransformers

Once the installation is complete, we can download the Llama 2 model weights using the following code block:

                        
 from huggingface_hub import hf_hub_download

 # Repository and file name of the quantized Llama 2 chat model on the Hugging Face Hub
 model_name_or_path = "TheBloke/Llama-2-7B-Chat-GGML"
 model_basename = "llama-2-7b-chat.ggmlv3.q5_0.bin"  # the model is in bin format

 # Download the file (or reuse the locally cached copy) and get its local path
 model_path = hf_hub_download(repo_id=model_name_or_path, filename=model_basename)

In this code block, we use the Hugging Face Hub to download the Llama 2 model weights. The repository ID is "TheBloke/Llama-2-7B-Chat-GGML", a community-quantized GGML conversion of the 7B chat model, and the filename is "llama-2-7b-chat.ggmlv3.q5_0.bin". The model is in bin format, and the hf_hub_download function downloads the file and returns its local path.
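Because the model file is several gigabytes, it can be worth confirming that the download actually produced a file before trying to load it. The helper below is a minimal sketch of such a check; describe_model_file is a hypothetical name introduced here for illustration and is not part of huggingface_hub.

```python
from pathlib import Path


def describe_model_file(path: str) -> str:
    """Return a one-line summary of a model file, or raise if it is missing."""
    file = Path(path)
    if not file.is_file():
        raise FileNotFoundError(f"Model file not found: {file}")
    size_gib = file.stat().st_size / 1024**3  # bytes to GiB
    return f"{file.name}: {size_gib:.2f} GiB"


# Usage, assuming model_path holds the path returned by hf_hub_download:
# print(describe_model_file(model_path))
```

A missing or misspelled path then fails early with a clear error, rather than later inside the model loader.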

Next, we can use the following code block to load the Llama 2 model using Langchain and Ctransformers:

                        
 from langchain.llms import CTransformers

 # Load the GGML model; generation settings are passed through the config dict
 llm = CTransformers(
     model=model_path,
     model_type="llama",
     config={"max_new_tokens": 512, "temperature": 0.5},
 )
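The temperature value of 0.5 used above controls how sharply the model favors its most likely next token. The sketch below shows the standard softmax-with-temperature calculation in plain Python; the logits are made-up numbers for three candidate tokens, and the function is an illustration of the idea rather than the code that CTransformers runs internally.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn raw scores into probabilities, dividing by the temperature first."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
for temperature in (1.0, 0.5):
    probs = softmax_with_temperature(logits, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

Lowering the temperature from 1.0 to 0.5 concentrates more probability on the highest-scoring token, which makes the output more predictable and less varied.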

Next, we can define a custom prompt template so that the model returns output in the desired form.

                        
 from langchain.prompts import PromptTemplate

 # Llama 2 chat format: instruction tags plus system-prompt tags
 B_INST, E_INST = "[INST]", "[/INST]"
 B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

 DEFAULT_SYSTEM_PROMPT = """\
 Always answer as helpfully as possible, while being safe. Your answers
 should not include any harmful, unethical, racist, sexist, toxic,
 dangerous, or illegal content. Please ensure that your responses are
 socially unbiased and positive in nature.

 If a question does not make any sense, or is not factually coherent,
 explain why instead of answering something not correct. If you don't
 know the answer to a question, please don't share false information."""

 # {text} is filled in with the sentence to translate when the chain runs
 instruction = "Convert the following text from English to French: \n\n {text}"

 SYSTEM_PROMPT = B_SYS + DEFAULT_SYSTEM_PROMPT + E_SYS

 template = B_INST + SYSTEM_PROMPT + instruction + E_INST

 prompt = PromptTemplate(template=template, input_variables=["text"])
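The same prompt layout can be assembled without LangChain, which makes the structure easier to see. The sketch below assumes the single-turn Llama 2 chat format with "[INST]" instruction tags and "<<SYS>>" / "<</SYS>>" system tags; build_llama2_prompt is a hypothetical helper name, and the short system prompt is a placeholder rather than the one used above.

```python
B_INST, E_INST = "[INST]", "[/INST]"
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"


def build_llama2_prompt(system_prompt: str, user_message: str) -> str:
    """Assemble a single-turn Llama 2 chat prompt from its two parts."""
    return f"{B_INST} {B_SYS}{system_prompt}{E_SYS}{user_message.strip()} {E_INST}"


prompt_text = build_llama2_prompt(
    "You are a careful translator.",  # placeholder system prompt
    "Convert the following text from English to French:\n\nGood morning",
)
print(prompt_text)
```

The system prompt sits inside the first instruction block, so the model sees the behavioral guidance and the user request as one turn.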

Finally, we query the model using the prompt:

                        
 from langchain.chains import LLMChain

 # Combine the prompt template and the model, then run the chain on an input sentence
 llm_chain = LLMChain(prompt=prompt, llm=llm)
 print(llm_chain.run("My name is Akash Kumar"))

The model responds with output similar to the following:

In French, you can use the phrase "Je m'appelle Akash Kumar" which means "My name is Akash Kumar." However, it's important to note that the name "Akash" is not a common French name, and "Kumar" is not a typical French surname. The name "Akash Kumar" is an English name and does not have a direct equivalent in French. If you have any other questions or requests, please feel free to ask!
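Conceptually, the chain does two things: it fills the {text} placeholder in the template, then passes the resulting string to the model. The sketch below imitates that flow in plain Python with a stand-in model function, so it runs without downloading any weights; run_chain and echo_llm are hypothetical names for illustration, not LangChain APIs.

```python
from typing import Callable


def run_chain(llm: Callable[[str], str], template: str, **variables: str) -> str:
    """Fill the template's placeholders, then pass the prompt to the model."""
    prompt_text = template.format(**variables)
    return llm(prompt_text)


def echo_llm(prompt_text: str) -> str:
    """Stand-in for a real model: reports the size of the prompt it received."""
    return f"received {len(prompt_text)} characters"


template = "[INST]Convert the following text from English to French:\n\n{text}[/INST]"
print(run_chain(echo_llm, template, text="My name is Akash Kumar"))
```

Swapping echo_llm for a real model callable leaves run_chain unchanged, which is the separation between prompt and model that the chain provides.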

To chat with a Llama 2 model in the browser, visit the playground website at https://www.llama2.ai/

Conclusion

Meta AI's Llama 2 represents a significant leap forward in the world of generative Large Language Models. With its open release of model weights, large-scale training data, and strong performance in benchmark evaluations, Llama 2 is poised to play a pivotal role in the development of AI-driven applications across various domains. As the AI landscape continues to evolve, initiatives like Llama 2 demonstrate the potential for democratizing AI and fostering innovation on a global scale. Researchers, developers, and AI enthusiasts now have a powerful tool at their disposal, thanks to Meta AI's commitment to advancing the field of Generative AI.

Akash Kumar Pavadashetti

Engineering Lead

Senior Full Stack Engineer with experience in designing architecture and schema of an application. Well versed with Python, Django, React and AWS services. A good communicator who takes ownership in what he does.

Python Scripting Django WxPython React Redux Javascript MySQL Redis Gunicorn
