Data Engineering

Up next

It's RAG time: Retrieval-Augmented Generation

Today we are going to talk about the feature with the worst acronym in generative AI: RAG, or Retrieval Augmented Generation. If you've ever used something like "Chat with My Docs," if you have an internal AI chatbot that has access to your company's documents, or you've created ...

Chasing Away Repetitive LLM Responses with Verbalized Sampling

One of the things that LLMs can be really helpful with is brainstorming or generating new creative content. They are called Generative AI, after all—not just for summarization and question-and-answer tasks. But if you use LLMs for creative generation, you may find that their outp ...

Recommended Episodes

Moving Machine Learning Into The Data Pipeline at Cherre
Data Engineering Podcast

Summary

Most of the time when you think about a data pipeline or ETL job what comes to mind is a purely mechanistic progression of functions that move data from point A to point B. Sometimes, however, one of those transformation ...

#162 Scaling Data Engineering in Retail with Mohammad Sabah, SVP of Engineering & Data at Thrive Market
DataFramed

Poor data engineering is like building a shaky foundation for a house—it leads to unreliable information, wasted time and money, and even legal problems, making everything less dependable and more troublesome in our digital world. In the retail industry specifically, data enginee ...

Unpacking The Seven Principles Of Modern Data Pipelines
Data Engineering Podcast

Summary

Data pipelines are the core of every data product, ML model, and business intelligence dashboard. If you're not careful you will end up spending all of your time on maintenance and fire-fighting. The folks at Rivery distilled the seven principles of mod ...

Data Quality Starts At The Source
Data Engineering Podcast

Summary

The most important gauge of success for a data platform is the level of trust in the accuracy of the information that it provides. In order to build and maintain that trust it is necessary to invest in defining, monitori ...
