GenAI: Reducing costs by up to 85% with routing and moderation with Llama 3.1 405B

In this project I will create a Python class that uses filtering to reduce costs, and on top of that I will add a moderation layer using Semantic Router. As explained before, we want to reduce the cost of our AI generation, and to do that we will use filtering. But what exactly is this? When a GenAI application is in production, a common situation arises: we are using a powerful LLM, which costs us money, so if the user asks a simple question, one that does not require that much power, we might prefer to answer it with a smaller and therefore cheaper LLM....

July 31, 2024 · 13 min · 2626 words · Jesús Manuel Remón González
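
The routing idea in this excerpt can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the article's implementation: `CostAwareRouter`, the word-count threshold, and the keyword list are hypothetical, and the keyword heuristic stands in for the semantic routing the article uses. The two models are plain callables so the sketch runs without any provider.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical stand-in for a real LLM client: takes a prompt, returns text.
LLM = Callable[[str], str]


@dataclass
class CostAwareRouter:
    """Send simple prompts to a cheap model and the rest to a large one."""

    small_llm: LLM
    large_llm: LLM
    max_simple_words: int = 12  # assumed threshold, tune for your traffic
    # Assumed markers of a demanding request (illustrative only).
    complex_markers: Tuple[str, ...] = ("explain", "compare", "analyze", "prove")

    def is_simple(self, prompt: str) -> bool:
        text = prompt.lower()
        # Any marker word means the prompt is treated as complex.
        if any(marker in text for marker in self.complex_markers):
            return False
        # Otherwise, short prompts are treated as simple.
        return len(text.split()) <= self.max_simple_words

    def answer(self, prompt: str) -> Tuple[str, str]:
        """Return (model_label, response) so the routing decision is visible."""
        if self.is_simple(prompt):
            return "small", self.small_llm(prompt)
        return "large", self.large_llm(prompt)


# Dummy models that only echo the prompt, so the example is self-contained.
router = CostAwareRouter(
    small_llm=lambda p: f"[small] {p}",
    large_llm=lambda p: f"[large] {p}",
)
simple_label, _ = router.answer("What is the capital of France?")
complex_label, _ = router.answer("Explain how QLoRA reduces memory usage")
```

The savings come from how often `is_simple` returns `True`; in a real system the decision would more likely be based on embeddings than on keywords.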

QLORA Finetuning from scratch for Mistral 7B + Perplexity

In this project I will fine-tune Mistral 7B from scratch using QLoRA, then compare the fine-tuned model with the base model on an example generation, and finally run a perplexity test to measure the improvement after training. In a previous article, I performed a fine-tuning using a third-party provider; in this case, I will build the process from scratch so that this code can be easily applied to any other dataset or model....

July 14, 2024 · 10 min · 1961 words · Jesús Manuel Remón González
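
For readers unfamiliar with the perplexity test mentioned in this excerpt: perplexity is the exponential of the average negative log-likelihood per token, so a lower value means the model finds the text less surprising. The sketch below is a minimal illustration, not the article's code; the `perplexity` helper and the sample log-probabilities are assumptions made for the example.

```python
import math
from typing import Sequence


def perplexity(token_logprobs: Sequence[float]) -> float:
    """Perplexity from per-token natural-log probabilities.

    PPL = exp(-(1/N) * sum(log p_i)); lower is better.
    """
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    avg_nll = -sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg_nll)


# Made-up numbers: a model that gives each token probability 0.25
# versus one that gives each token probability 0.5.
base_ppl = perplexity([math.log(0.25)] * 4)
tuned_ppl = perplexity([math.log(0.5)] * 4)
```

A model that assigns higher probability to the evaluation text gets the lower perplexity, which is the kind of improvement a before/after comparison looks for.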

Data Analysis: Quantum Vs Classical Computing

In this project I will compare the accuracy of a classification algorithm run on a quantum computer versus a classical computer. In the output, we will compare the top features selected by the quantum computer with those chosen by the classical one. To code everything in Python easily, I will use the Qiskit package. As mentioned before, we are going to compare two machines performing a classification task. The algorithm that we will put to the test is SVC (Support Vector Classifier), an implementation of an SVM (Support Vector Machine) for classification tasks....

July 9, 2024 · 8 min · 1633 words · Jesús Manuel Remón González

Finetuned LLM for text-to-SQL responses

In this project I will develop a chatbot that generates SQL answers, and I will also fine-tune the model with the aim of reducing token consumption and improving the quality of the results. We could do all of this more simply with a platform such as Cohere or OpenAI, but in my case I will use Lamini, because it gives us the option of fine-tuning other LLMs such as Llama 3....

June 23, 2024 · 9 min · 1792 words · Jesús Manuel Remón González

Chatbot for research with Semantic Router

This project is a chatbot that leverages the Cohere LLM to answer research questions by searching online for answers. It can provide text-based citations and links to sources for verification. The Semantic Router is used to manage responses, ensuring confidentiality and appropriate user behavior. Disclaimer: This project is intended for educational purposes and to showcase a chatbot’s capabilities. The results generated may be inaccurate and should not be considered reliable....

May 14, 2024 · 15 min · 3009 words · Jesús Manuel Remón González