Scaling up fine-tuning and batch inference of LLMs such as Llama 2 (including the 7B, 13B, and 70B variants) across multiple nodes, without having to worry about the complexity of distributed systems.
Unlocking the Power of Ollama Infrastructure for Local Execution of Open Source Models and Interacting with PDFs
A guide to implementing a Flask API for loading Llama models.
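A minimal sketch of what such a Flask API could look like, assuming Flask is installed; the `load_llama_model` helper and the `/health` and `/generate` routes are illustrative placeholders, not the guide's actual implementation (real loading would use a library such as llama-cpp-python or Hugging Face transformers).

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Placeholder for the loaded model; in practice this would hold e.g. a
# llama_cpp.Llama instance or a transformers pipeline object.
_model = None


def load_llama_model():
    """Hypothetical loader: swap in your actual model-loading code."""
    return lambda prompt: f"echo: {prompt}"  # stand-in for real inference


@app.route("/health")
def health():
    # Report whether the model has been loaded yet.
    return jsonify(status="ok", model_loaded=_model is not None)


@app.route("/generate", methods=["POST"])
def generate():
    global _model
    if _model is None:  # lazy-load the model on first request
        _model = load_llama_model()
    prompt = request.get_json(force=True).get("prompt", "")
    return jsonify(completion=_model(prompt))


if __name__ == "__main__":
    app.run(port=5000)
```

Lazy-loading keeps server startup fast and avoids holding model weights in memory until the first inference request arrives.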