3 Easy Fixes for Everyday Chatbot Headaches
They're not revolutionary, but they might come in handy.
Look, chatbots are pretty great.
Without them, we’d have a hard time coming up with repetitive lists of exactly three things or counting the r’s in “strawberry.”
But now and then, I run into minor annoyances when working with them.
So I’ve developed a few simple workarounds that do the trick for me.
If you’re in the same boat, then get the hell out of my boat, pal! It’s a one-person inflatable, and you’re gonna drown us both. Jesus!
Anyway, maybe you’ll find these little quality-of-life tricks helpful.
1. Get context-aware draft analysis
I often use chatbots as beta readers for my writing.
The headache:
I draft my newsletter posts directly in the Substack editor.
I like to know exactly how the final post will appear, including embedded YouTube links, images with captions, and so on.
But to get feedback from a chatbot, I used to copy-paste the draft, which stripped away everything but plain text. This mostly worked, but headings, formatting, tables, visuals, and the like got lost in translation.
So I had to skip over any formatting feedback or pre-emptively ask chatbots to “Just focus on the text. I’ve already taken care of the formatting. The formatting is fine. I just can’t paste it here. Don’t talk to me about the formatting. Trust me, bro, it looks nice on the page, okay?”
The fix:
Two words: “Ctrl+P”!
Wait, is that three words? Dammit, Daniel. Do better!
But seriously, whenever I want feedback on any work in progress, I simply hit Ctrl+P and then “Save as PDF”:
I download the entire post as a PDF document, then throw it into my chatbot of choice for feedback.1 This has a bunch of benefits:
It keeps the formatting and visual layout, giving the chatbot better context.
You get a reusable file you can feed to any LLM, instead of copy-pasting every time.
The process is entirely agnostic to your content editor or writing software. No matter which page or screen you’re on, Ctrl+P lets you save the entire context.
Alternative: If your post is relatively short, you can also save a full scrolling-page screenshot using a free extension like GoFullPage. This way, chatbots can use their vision capabilities to parse the screen as an image while also reading the text. But this approach deteriorates quickly for longer documents, because image compression makes text and other context harder for chatbots to parse.
Bonus tip: I strongly recommend Gemini (especially in Google AI Studio) for this. While ChatGPT and Claude can process PDFs, they struggle to “see” inside the embedded images.2 In my experience, Gemini can reliably parse images and get the full context.
2. Request copy-pasteable output
You can get a lot of work done inside the chatbot interface, but sometimes you’ll want to copy a model’s output to use elsewhere.
The headache:
While there’s usually a handy “copy” button, it copies the chatbot’s entire response, not just the part you actually need.
So after pasting the output into an external tool, you’ll have to clean up ChatGPT’s chatter, from “Sure, I can help you with this task by first talking about how I’m going to help you with this task before actually helping you with the task” to “Do you also want me to spin up a quick alternative approach to the task or analyze the task in another way? Just say the word!”
The fix:
Add this line at the end of your request:
Return as a code block.
This separates the output from any chatbot preambles/postambles and gives you a handy “Copy code” button. Works in just about every chatbot.
ChatGPT:
Claude:
Gemini:
Grok:
You get the picture.
Most of them also end up using markdown by default3, which is quite handy when pasting content into tools that work well with structured outputs.
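If you’re pulling responses through an API rather than the chat UI, the same trick pays off in code: a small helper can grab only the fenced portion and discard the chatter around it. Here’s a minimal sketch (the function name `extract_code_block` is my own, not part of any SDK):

```python
import re

def extract_code_block(response: str) -> str:
    """Return the contents of the first fenced code block in a chatbot
    response, dropping any preamble or follow-up chatter."""
    # Match ``` with an optional language tag, then everything (lazily)
    # up to the closing fence. DOTALL lets "." span newlines.
    match = re.search(r"```[^\n]*\n(.*?)```", response, re.DOTALL)
    if match is None:
        # No fence found: fall back to the raw response.
        return response.strip()
    return match.group(1).strip()

# Example: a typical response with chatter around the useful part.
reply = (
    "Sure, here's the cleaned-up list you asked for:\n"
    "```markdown\n"
    "- Item one\n"
    "- Item two\n"
    "```\n"
    "Want me to expand on any of these?"
)
print(extract_code_block(reply))  # prints only the two list items
```

The non-greedy `(.*?)` matters: with a greedy match, a response containing several code blocks would swallow everything up to the last fence instead of stopping at the first one.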
3. Turn voice ramblings into structured outlines
All major chatbots can talk to you now, and using voice instead of typing sure sounds promising!
The headache:
Voice chats are great for quick back-and-forth exchanges, but when it comes to more complex tasks, they fall apart for one big reason: They speak every. Word. They. Type. Out. Loud.
My ideal use case for voice chat is to be able to fire a stream of disorganized ideas into the microphone and have the chatbot wrangle some semblance of structure out of those.
But if I want a chatbot to do a deep dive and ask me a bunch of follow-up questions, I have to sit patiently as it speaks out the entire list before I can move on.
Worse still, if I request any form of detailed output (e.g. an organized outline or a deep analysis), I once again have to let it talk, lest my interruption cut off its response mid-sentence.
I’ve tried many approaches, and I’ve yet to find a way to get a chatbot to silently return longer outputs in the background while only saying “Done” to confirm when it’s finished.4
Of course, there’s always the “voice dictation” feature that uses speech-to-text to turn your long voice tirade into a block of text to send to the chatbot. I often use it for quick “Do this” tasks, but it’s not optimal for deeper chats, because it removes the critical interaction element.5
The fix:
This is more of a process tweak than a hidden feature or shortcut.
Simply put, I split the interaction into two phases:
Unstructured voice chat (aka “context dump”)
Text-based organization
Phase one:
During this phase, I have a freeform voice chat about my task. I’ll often prime the chatbot for quicker back-and-forth interaction and bake in the “Ask me questions” approach. So I’ll present my goal and add something like this:
Interview me about this task/request/etc. Ask questions one by one until there is nothing left to cover or I say we’re done.
This turns the chatbot into an efficient interviewer that picks my brain without causing cognitive overload with long spoken-word essays.
When I’m done talking, I simply end the voice chat. I can now go about my day or head straight to my laptop for phase two.
Phase two:
All voice chats are saved as text transcripts, so I have the entire history ready in the context window. Now, I can pick up right where I left off and ask for structured output in text form without having to listen to it being read out loud.
This way, I get the best of both worlds by having breezy voice chats and structured long-form text output.
🫵 Over to you…
Have you also run into these limitations? Do you have better ways to work around them? What other chatbot inconveniences have you noticed and how do you deal with them?
Leave a comment or drop me a line at whytryai@substack.com.
Thanks for reading!
If you enjoy my writing, here’s how you can help:
❤️Like this post if it resonates with you.
🔄Share it to help others discover this newsletter.
🗣️Comment below—I love hearing your opinions.
Why Try AI is a passion project, and I’m grateful to those who help keep it going. If you’d like to support my work and unlock cool perks, consider a paid subscription:
Every major platform, from ChatGPT to Claude to Gemini, lets you upload PDFs for analysis.
I find that you can prompt ChatGPT into downloading the embedded images and running OCR and other checks to parse their content, but this process is hit or miss and engages the much slower reasoning model, so it’s rarely worth it.
…and you can easily get Claude to do the same by simply adding “use markdown” to your prompt.
If you found a way to make voice chats work like this, I’d love to hear about it!
Also, there’s the non-trivial risk of misclicking and watching your entire voice note disappear instead of being sent to the chatbot.