Data synthesis for SOTA LLMs

Update: 2024-02-06

Description

Nous Research has been pumping out some of the best open-access LLMs using SOTA data synthesis techniques. Their Hermes family of models is incredibly popular! In this episode, Karan from Nous talks about the origins of Nous as a distributed collective of LLM researchers. We also get into fine-tuning strategies and why data synthesis works so well.

Changelog++ members save 2 minutes on this episode because they made the ads disappear. Join today!

Sponsors:

Read Write Own – Read, Write, Own: Building the Next Era of the Internet—a new book from entrepreneur and investor Chris Dixon—explores one possible solution to the internet’s authenticity problem: Blockchains. From AI that tracks its source material to generative programs that compensate—rather than cannibalize—creators. It’s a call to action for a more open, transparent, and democratic internet. One that opens the black box of AI, tracks the origins we see online, and much more. Order your copy of Read, Write, Own today at readwriteown.com
Fly.io – The home of Changelog.com — Deploy your apps and databases close to your users. In minutes you can run your Ruby, Go, Node, Deno, Python, or Elixir app (and databases!) all over the world. No ops required. Learn more at fly.io/changelog and check out the speedrun in their docs.

Featuring:

Karan Malhotra – LinkedIn
Chris Benson – Twitter, GitHub, LinkedIn, Website
Daniel Whitenack – Twitter, GitHub, Website

Show Notes:

Nous on Hugging Face

Nous Research

In Channel

Private, open source chat UIs – 2024-04-30, 38:27
Mamba & Jamba – 2024-04-24, 41:15
Udio & the age of multi-modal AI – 2024-04-16, 38:54
RAG continues to rise – 2024-04-10, 48:21
Should kids still learn to code? – 2024-04-02, 39:22
AI vs software devs – 2024-03-26, 57:02
Prompting the future – 2024-03-20, 46:01
Generating the future of art & entertainment – 2024-03-12, 42:08
YOLOv9: Computer vision is alive and well – 2024-03-06, 42:48
Representation Engineering (Activation Hacking) – 2024-02-28, 43:36
Leading the charge on AI in National Security – 2024-02-20, 52:05
Gemini vs OpenAI – 2024-02-14, 43:31
Data synthesis for SOTA LLMs – 2024-02-06, 46:41
Large Action Models (LAMs) & Rabbits 🐇 – 2024-01-30, 48:15
Collaboration & evaluation for LLM apps – 2024-01-23, 46:16
Advent of GenAI Hackathon recap – 2024-01-17, 47:52
AI predictions for 2024 – 2024-01-10, 45:00
Open source, on-disk vector search with LanceDB – 2023-12-19, 41:53
The state of open source AI – 2023-12-12, 42:37
Suspicion machines ⚙️ – 2023-12-05, 46:57
