QLoRA Finetuning from scratch for Mistral 7B + Perplexity

In this project I will finetune Mistral 7B from scratch using QLoRA, compare the finetuned model with the base model on an example generation, and finally run a perplexity test to measure the improvement after training. In a previous article I finetuned a model through a third-party provider; in this case I will build the process from scratch so the code can be easily applied to any other dataset or model....

July 14, 2024 · 10 min · 1961 words · Jesús Manuel Remón González

Finetuned LLM for text-to-SQL responses

In this project I will develop a chatbot that generates SQL answers, and I will finetune the model with the aim of reducing token consumption and improving the quality of the results. All of this could be done more simply with a platform such as Cohere or OpenAI, but I will use Lamini because it offers the option of finetuning other LLMs such as Llama 3....

June 23, 2024 · 9 min · 1792 words · Jesús Manuel Remón González

Chatbot for research with Semantic Router

This project is a chatbot that leverages the Cohere LLM to answer research questions by searching online for answers. It can provide text-based citations and links to sources for verification. The Semantic Router is used to manage responses, ensuring confidentiality and appropriate user behavior. Disclaimer: This project is intended for educational purposes and to showcase a chatbot’s capabilities. The results generated may be inaccurate and should not be considered reliable....

May 14, 2024 · 15 min · 3009 words · Jesús Manuel Remón González

Vector Database for an LLM: Diabetes Information

In this Python project, we will download diabetes information from MedlinePlus, a reputable website from the United States government. The documents cover a variety of diabetes-related content, such as descriptions, symptoms, and related conditions. All of that data will be vectorized and stored in a vector database, Pinecone, to enable semantic search over it. Disclaimer: This project is intended for educational and exploratory purposes only. It aims to facilitate data analysis and to illustrate potential future applications of AI in medicine....

April 16, 2024 · 10 min · 2086 words · Jesús Manuel Remón González

HippocratesAI: Using GenAI to diagnose medical conditions

In this Python project, I will create a platform that uses AI to speed up medical diagnosis, helping doctors and patients by accelerating medical service and thus reducing costs and time. Disclaimer: This project is intended for educational and exploratory purposes only. It aims to facilitate data analysis and to illustrate potential future applications of AI in medicine. Please note that I am not a medical professional, and the outputs generated by this project should not be considered as medical advice....

April 14, 2024 · 4 min · 800 words · Jesús Manuel Remón González