How to train your GPT (Part 2)

Uploading your custom documents to give your GPT a knowledge base

By Fawzi Ammache
January 19, 2024

One of the most powerful features of custom GPTs is the ability to upload your own documents and give your GPT a unique “brain”, or knowledge base.

In Part 1 of this series, I covered how I integrated my newsletter’s content (75+ articles) into my GPT’s knowledge base. If you missed it, you can read it at the link below.

Many of you sent me follow-up questions about uploading private documents directly into the GPT’s knowledge base, as opposed to using the web browsing feature to access content that already exists publicly online.

So, we’ll cover that in Part 2 today!

Choose your documents wisely

Under the Configure tab, you can upload files in the Knowledge section. At the time of writing, you can upload a maximum of 10 documents.

The biggest mistake you can make is uploading documents all willy-nilly into your GPT.

Before you know it, your GPT’s sifting through documents like it’s Harry Potter swarmed by Hogwarts letters, just to answer a simple question. So make sure to define specific use cases and only upload knowledge that’s relevant to your GPT’s tasks.

🚨 Beware:

People using your GPT may be able to download your documents if Code Interpreter is enabled in the Capabilities section (right below Knowledge). Make sure to turn it off if you don’t want anyone accessing the full documents you’ve uploaded.

File formats

You can upload most common file formats into your GPT, including:

  • Text documents: TXT or Word
  • Spreadsheets
  • Presentations
  • PDFs

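Regardless of the format, make sure you upload a clean and readable document that a computer can easily parse. My assumption is that GPTs use RAG (retrieval-augmented generation) to store and look up knowledge, which means each document gets split into smaller chunks that are retrieved when a question comes in.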

For that reason, try to upload documents with simple, one-column layouts that can be easily parsed, cut up, and stored. If your data is all text, your safest bet is a TXT file. Beware of documents with complex layouts (like PDFs with two columns of content), as the GPT may chunk up and store the content inaccurately.
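To see why layout matters, here is a minimal Python sketch of what can go wrong. It is a toy illustration, not how OpenAI actually processes files: the chunk_words helper, the 8-word chunk size, and the sample sentences are all made up for this example. It assumes a naive extractor that reads a two-column page straight across, line by line.

```python
def chunk_words(text: str, max_words: int = 8) -> list[str]:
    """Split text into chunks of at most max_words words (hypothetical chunker)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


# Made-up content laid out in two side-by-side columns, two lines each.
left_column = ["Paydays fall on the", "15th of each month."]
right_column = ["Holidays follow the", "company calendar."]

# One-column layout: the left column is read in full, then the right column.
one_column = " ".join(left_column + right_column)

# Two-column layout read straight across the page: lines get interleaved.
interleaved = " ".join(f"{left} {right}" for left, right in zip(left_column, right_column))

print(chunk_words(one_column))
# ['Paydays fall on the 15th of each month.', 'Holidays follow the company calendar.']

print(chunk_words(interleaved))
# ['Paydays fall on the Holidays follow the 15th', 'of each month. company calendar.']
```

In the one-column version, each chunk holds one complete fact. In the interleaved version, neither chunk answers "when is payday?" cleanly, which is the kind of inaccurate storage you want to avoid.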

If you rewatch Sam Altman’s live demo of building a GPT with custom documents, you’ll notice he used a TXT file containing his lecture notes. So if you have complex documents, it’s worth spending a bit of time reformatting them to ensure good knowledge transfer.
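If your text is already extracted and just messy, a few lines of Python can tidy it up before you save it as a TXT file. This is a rough sketch under my own assumptions: the clean_text function and its cleanup rules are hypothetical, and they only handle whitespace and words split across line breaks, not layout problems like columns or tables.

```python
import re


def clean_text(raw: str) -> str:
    """Tidy extracted text into plain, one-column text (hypothetical rules)."""
    # Normalize Windows line endings.
    text = raw.replace("\r\n", "\n")
    # Rejoin words hyphenated across a line break, e.g. "Com-\npany".
    # Caveat: this also joins real compounds that happen to break at the hyphen.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Trim leading and trailing whitespace on every line.
    text = "\n".join(line.strip() for line in text.split("\n"))
    # Keep at most one blank line between paragraphs.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip() + "\n"


messy = "Com-\npany   Calendar\r\n\r\n\r\n\r\nPaydays:\t15th  "
print(clean_text(messy))
```

From there, writing the result to disk with something like pathlib's Path.write_text gives you a TXT file to upload. Always skim the output before uploading, since rules like these can't tell whether the reading order was right in the first place.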

But don’t worry, I built a custom GPT you can use to simplify your documents and convert them into a TXT format (how meta, I know!). I’ve used my GPT-Friendly Document Maker to convert a few PDFs and it works like a charm. Plus, it significantly reduces the file size!

Try the GPT-Friendly Document Maker

Recommended instructions

My initial attempts at integrating my private documents into my GPT were extremely frustrating. It kept ignoring the documents I uploaded and answered questions from GPT-4’s general knowledge instead.

I landed on an instruction that seems to be working for now, and you’ll find this useful if you want your GPT to only answer questions based on the documents you uploaded:

Prompt:

This GPT should always search its knowledge base before answering

I noticed that OpenAI was using those exact words when a GPT was referencing its knowledge base, so I decided to reuse them in my instruction. This has helped my GPT drift less into GPT-4’s general knowledge and focus more on the documents I uploaded.

Based on your use case, you may want to be even more specific with that prompt. If you uploaded multiple documents, you may need to specify which document to reference for specific questions.

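For example, you may have built a GPT to onboard new employees. The GPT should reference the FAQ document for general questions, but the Company Calendar for questions relating to paydays and holidays. In that case, your instruction could say something like: “For questions about paydays and holidays, search the Company Calendar. For all other questions, search the FAQ document.”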

Fawzi Ammache
Founder, Year 2049
