I posted a few weeks ago about my AI Traveler project, and how I built some scripts and tools to completely, 100% automate a basic YouTube channel.
It's been running automatically for about 2 weeks now and I've made lots of little changes and tweaks, so I wanted to share my findings for anyone else playing in this space.
It's an interesting collection of AI and technical quirks that sometimes disappoint and sometimes entertain.
Prompt Engineering
If you've played at all in the new LLM space, you've heard the term "Prompt Engineering". What is it? Wikipedia says:
Prompt engineering or prompting is the process of structuring sentences so that they can be interpreted and understood by a generative AI model in such a way that its output is in accord with the user's intentions.
So that doesn't really help, does it? Let me give you a concrete example. In the first version of the script I used a prompt like:
Write about the following topic: {prompt}. Write in short sentences separated by . Write about {nLen} complete sentences.
Generally that worked. But there were a few main problems:
- Using a period as a separator only works if there's no period in your title, like "St. Peter's Basilica".
- Even when told to write short sentences, it can generate some really long run-on compound sentences, which don't work well in this use case.
- Sometimes it can be too literal and literally return "sentences". Like, the single word. 🤦
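To make the first problem concrete, here's a minimal sketch of what naive period-splitting does to a title with an abbreviation in it. This is not the actual script; `split_on_periods` and the sample text are made up for illustration.

```python
def split_on_periods(text):
    """Naively split text into 'sentences' on every period (hypothetical helper)."""
    return [part.strip() for part in text.split(".") if part.strip()]


# Two real sentences, but the abbreviation "St." adds an extra split point.
text = "St. Peter's Basilica is in Vatican City. It took over a century to build."
pieces = split_on_periods(text)

print(len(pieces))  # → 3
print(pieces[0])    # → St
```

Two sentences come back as three fragments, and the first one is just "St", which is not something you want read aloud over a video.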
It took some tinkering, but I eventually rewrote the prompt to return the result as a JSON array of proper sentences. That made things much more structured and easier to work with. However, the structure of the JSON would vary just a bit from run to run. Sometimes you get a basic array. Sometimes you get an object with a single key whose value is the array. Sometimes you get just the array contents without the enclosing brackets. It took a combination of prompt work and Python code to build something robust, but it's been a good week now without any failed executions.
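The Python side of that can be sketched roughly like this. It's a minimal example assuming the three response shapes described above, not the code the channel actually runs; the function name `extract_sentences` and the sample replies are hypothetical.

```python
import json


def extract_sentences(raw):
    """Normalize an LLM reply into a list of sentence strings (hypothetical helper).

    Handles three shapes: a plain JSON array, an object whose value is the
    array, and array items returned without the enclosing brackets.
    """
    text = raw.strip()
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        # Assume the items came back without brackets, so wrap them and retry.
        # If this also fails, the JSONDecodeError propagates to the caller.
        data = json.loads("[" + text.strip(",") + "]")

    if isinstance(data, dict):
        # Take the first list found among the object's values.
        lists = [value for value in data.values() if isinstance(value, list)]
        if not lists:
            raise ValueError("no array found in JSON object")
        data = lists[0]
    elif isinstance(data, str):
        data = [data]  # a single bare string counts as one sentence

    if not isinstance(data, list):
        raise ValueError("unexpected JSON shape")

    # Keep only non-empty strings.
    return [item.strip() for item in data if isinstance(item, str) and item.strip()]


print(extract_sentences('{"sentences": ["First one.", "Second one."]}'))
# → ['First one.', 'Second one.']
```

Raising on anything unrecognized, rather than guessing, makes it easy to retry the request instead of publishing a broken video.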
Resolution
I'm still running this on a Raspberry Pi 3B+, which was capped at 720p resolution. It's my own fault: I grabbed one that was handy without looking too closely, and it only had 1 GB of RAM. I switched to a 4 GB unit, and now it can generate 1080p videos.
- A 55-second short takes 45 minutes to encode
- A 3-minute video takes 2 hours to encode
Which leads into the next topic:
API Limits
Google approved my request for a Quota increase, so I reconfigured the tools:
- Generate a 55s short every 2 hours
- Generate a 3 minute video 3x/day
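As a quick sanity check on that schedule, here's a back-of-the-envelope sketch that combines it with the encode times from the Resolution section. It assumes everything is encoded one at a time on a single Pi; the variable names are just for illustration.

```python
# Encode times from the Resolution section, in minutes.
SHORT_ENCODE_MIN = 45
VIDEO_ENCODE_MIN = 120

shorts_per_day = 24 // 2  # one short every 2 hours
videos_per_day = 3

uploads_per_day = shorts_per_day + videos_per_day
encode_minutes = shorts_per_day * SHORT_ENCODE_MIN + videos_per_day * VIDEO_ENCODE_MIN
encode_hours = encode_minutes / 60

print(f"{uploads_per_day} uploads/day, {encode_hours:.1f} hours of encoding/day")
# → 15 uploads/day, 15.0 hours of encoding/day
```

Under that assumption the schedule keeps the Pi encoding for about 15 of every 24 hours, so there's some headroom left but not a lot.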
I also added a yake pass to the video description to generate hashtags for the video.
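The keyword-to-hashtag step could look something like the sketch below. It assumes the yake pass has already produced a ranked list of keyword phrases and only shows the formatting; `keywords_to_hashtags` and the sample phrases are hypothetical, not taken from the actual tools.

```python
import re


def keywords_to_hashtags(keywords, limit=5):
    """Turn keyword phrases into CamelCase hashtags (hypothetical helper).

    `keywords` is assumed to be ordered best-first, however the keyword
    extraction step produced them.
    """
    tags = []
    seen = set()
    for phrase in keywords:
        # Drop apostrophes so "Peter's" becomes "Peters", not "Peter" + "s".
        cleaned = phrase.replace("'", "").replace("\u2019", "")
        words = re.findall(r"[A-Za-z0-9]+", cleaned)
        if not words:
            continue
        tag = "#" + "".join(word[:1].upper() + word[1:] for word in words)
        if tag.lower() in seen:
            continue  # skip case-insensitive duplicates
        seen.add(tag.lower())
        tags.append(tag)
        if len(tags) == limit:
            break
    return tags


print(keywords_to_hashtags(["vatican city", "St. Peter's Basilica", "Vatican City"]))
# → ['#VaticanCity', '#StPetersBasilica']
```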
Results
All of these together have made a noticeable improvement to the quality of the videos. However, it's still far from perfect. Some astute viewers have even noticed comical items like this video:
A video of plants that only shows bridges and buildings.
I'm still far from anything profitable on the channel. Even with 1000+ subscribers, I'm under 5% of the viewer metrics required to even enable monetization. However, it's been a fun project and I've learned a lot about the capabilities of these systems.
With all of this running, my only real cost is the ChatGPT API usage, which comes to about $0.12/day, or less than $4/month.
If I keep working on it, I may try to replace some of the yake elements with ChatGPT. I would hope that can generate more relevant keywords instead of the current context-free system, but I would have to do some work to integrate that against my image search algorithms to handle query failures.