
AI-Guided Meditation: Enhancing Mental Well-being Through Computer Science and Technology

Despite meditation's many benefits for the mind and body, it is only practiced by a small portion of people across the world. Can AI change this?
By Peter Nguyen | August 21, 2023
Hypnotism-like visuals for AI guided meditation

My name is Peter Nguyen, and I am interested in improving health and education through computer science. I am pursuing this project because meditation has numerous benefits for our mental health and abilities, yet it is not widely practiced. Meditation comes in many forms, each with different benefits: it can be used to improve focus, relieve stress, relax into sleep, regulate emotions, increase self-awareness, and even improve physical health. Despite these benefits for the mind and body, only a small portion of people across the world practice it. With proper guidance, however, more people could experience meditation and make it part of their regular routine.

If meditation were adopted on a large scale, it could play a major role in improving relationships and reducing conflict and violence across the world. This project aims to make progress toward that goal by using AI to guide people into a meditative state.

Meditation Research

Through research, I learned about many different types of meditation and realized that the best type depends on the individual user. I then wrote a specific prompt for each of the five types of meditation I wanted my program to support: body scan, focused, visualization, reflection, and movement. These prompts are given to text generation models, which write a meditation script to guide the user through the practice. I also reasoned that involving all five senses would improve the user’s experience with meditation, so I adjusted some of the prompts to account for this, such as the following: “Write me a visualization meditation script noticing all 5 senses at the beach and is designed to boost mood, reduce stress, and promote inner peace.”
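A minimal sketch of how prompts like these could be assembled in Python is shown here. It is an illustration only: the function name `build_prompt`, its parameters, and the default goal text are assumptions, and only the beach visualization wording is modelled on the example quoted above.

```python
# Hypothetical prompt builder; the names and defaults are illustrative assumptions.
MEDITATION_TYPES = ("body scan", "focused", "visualization", "reflection", "movement")


def build_prompt(kind, setting=None, use_senses=False,
                 goals="boost mood, reduce stress, and promote inner peace"):
    """Build a text-generation prompt for one of the five meditation types."""
    if kind not in MEDITATION_TYPES:
        raise ValueError(f"unknown meditation type: {kind!r}")
    prompt = f"Write me a {kind} meditation script"
    if use_senses:
        # Ask the model to involve all five senses in the script.
        prompt += " noticing all 5 senses"
    if setting:
        prompt += f" {setting}"
    prompt += f" that is designed to {goals}."
    return prompt


if __name__ == "__main__":
    print(build_prompt("visualization", setting="at the beach", use_senses=True))
```

Keeping the meditation type, setting, and goals as separate parameters makes it easy to try variations of one prompt without rewriting it by hand.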

Text Generation

I started by prompting ChatGPT to create sample scripts and refining the prompts that I wrote. The scripts were fantastic, but I wanted a free text generation model that I could use to create scripts without relying on a limited API key. After testing several models, I settled on tiiuae/falcon-7b-instruct from Hugging Face. Its generated scripts are shorter than those of ChatGPT, but the quality is high. I may search for better models in the future, but this one is satisfactory for now.
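For illustration, a model such as tiiuae/falcon-7b-instruct can be called through the Hugging Face `transformers` text-generation pipeline roughly as follows. This is a sketch under assumptions, not the project's actual code: the helper names, the sampling settings, and the token limit are placeholders, and a 7B model needs substantial memory (ideally a GPU) to run.

```python
def load_generator(model_name="tiiuae/falcon-7b-instruct"):
    """Create a text-generation pipeline (downloads the model on first use)."""
    # Imported lazily so the rest of the module loads without transformers installed.
    from transformers import pipeline
    return pipeline("text-generation", model=model_name)


def generate_script(prompt, generator, max_new_tokens=400):
    """Ask the model for a meditation script and return only the new text."""
    outputs = generator(
        prompt,
        max_new_tokens=max_new_tokens,  # assumed length limit, tune as needed
        do_sample=True,
        temperature=0.7,                # assumed sampling temperature
        return_full_text=False,         # omit the prompt from the returned text
    )
    return outputs[0]["generated_text"].strip()


if __name__ == "__main__":
    generator = load_generator()
    print(generate_script("Write me a body scan meditation script.", generator))
```

Passing the generator in as an argument keeps the model swappable, which helps when comparing several candidate models on the same prompts.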

Text-to-Speech

I experimented with two Python libraries, gTTS and pyttsx3. The pyttsx3 library had only two voice options, both of which sounded very robotic. It lets you customize pitch and speech rate, but I could only find a way to set them for the entire audio rather than changing them in certain parts. gTTS, on the other hand, had voices that sounded more natural. I chose the Indian English accent for now because it sounds the most soothing to me. I also added longer pauses after each sentence and used the Pydub library to add background music to the audio, although for now it only uses music from one file that I downloaded.
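One possible way to combine these pieces is sketched below: synthesize each sentence separately with gTTS, append silence after it, and overlay a looped music track with Pydub. The sentence-splitting heuristic, the pause length, the music gain, and the function names are all assumptions for illustration. gTTS needs network access, and Pydub needs ffmpeg installed to decode MP3 audio.

```python
import io
import re


def split_sentences(script):
    """Split on '.', '!' or '?' followed by whitespace (a rough heuristic)."""
    parts = re.split(r"(?<=[.!?])\s+", script.strip())
    return [p for p in parts if p]


def synthesize(script, music_path, out_path, pause_ms=1500, music_gain_db=-18):
    """Speak the script sentence by sentence, pausing after each, over music."""
    # Imported lazily so split_sentences works without these packages.
    from gtts import gTTS
    from pydub import AudioSegment

    voice = AudioSegment.empty()
    for sentence in split_sentences(script):
        buf = io.BytesIO()
        # tld="co.in" selects the Indian English accent in gTTS.
        gTTS(text=sentence, lang="en", tld="co.in").write_to_fp(buf)
        buf.seek(0)
        voice += AudioSegment.from_file(buf, format="mp3")
        voice += AudioSegment.silent(duration=pause_ms)  # pause after the sentence

    # Lower the music volume, then loop it underneath the whole narration.
    music = AudioSegment.from_file(music_path) + music_gain_db
    mixed = voice.overlay(music, loop=True)
    mixed.export(out_path, format="mp3")
    return out_path
```

A call such as `synthesize(script, "music.mp3", "meditation.mp3")` would write the mixed audio to disk; both file names here are placeholders.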

Video Generation

I am now turning to video generation and testing various models that generate videos with realistic visuals. The programs I have tried so far either produce low-quality results or do not run at all, so I am still searching for one that can produce quality visuals of a peaceful environment. One high-quality model I have found is Stable Diffusion, but it only produces images; I could potentially use these to create short videos in the future. One high-quality video model I’ve found is Infinite Nature. However, it is difficult to incorporate into my program because it is an older repo that is not being actively maintained. I also plan to experiment with a compositional pattern-producing network (CPPN) to generate hypnotism-like visuals. I plan to dig deeper into the architecture and adjust it to control the color and speed of the output video, possibly by providing audio input in some way.
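To make the CPPN idea concrete, here is a minimal sketch using only the Python standard library. A small, randomly initialized network maps each pixel's coordinates and a time signal to an RGB color, and a `speed` parameter scales how fast the pattern changes. The network size, the choice of inputs, and all names are assumptions for illustration, not the architecture this project will necessarily use.

```python
import math
import random


def make_cppn(hidden=12, depth=3, seed=0, scale=1.5):
    """Random fully connected network: 5 inputs -> tanh hidden layers -> 3 outputs."""
    rng = random.Random(seed)
    sizes = [5] + [hidden] * depth + [3]
    return [
        [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
        for n_in, n_out in zip(sizes, sizes[1:])
    ]


def cppn_pixel(net, x, y, t, speed=1.0):
    """Return an (r, g, b) tuple for position (x, y) in [-1, 1] at time t."""
    phase = 2.0 * math.pi * speed * t  # speed controls how fast the pattern cycles
    act = [x, y, math.hypot(x, y), math.sin(phase), math.cos(phase)]
    for layer in net[:-1]:
        act = [math.tanh(sum(w * a for w, a in zip(row, act))) for row in layer]
    out = [sum(w * a for w, a in zip(row, act)) for row in net[-1]]
    # 0.5 * (1 + tanh(v / 2)) is the logistic sigmoid, mapping each channel to (0, 1).
    return tuple(int(255 * 0.5 * (1.0 + math.tanh(0.5 * v))) for v in out)


def render_frame(net, width, height, t, speed=1.0):
    """Return a height x width grid of (r, g, b) tuples for time t."""
    frame = []
    for j in range(height):
        y = 2.0 * j / max(height - 1, 1) - 1.0  # map rows to [-1, 1]
        row = []
        for i in range(width):
            x = 2.0 * i / max(width - 1, 1) - 1.0  # map columns to [-1, 1]
            row.append(cppn_pixel(net, x, y, t, speed))
        frame.append(row)
    return frame


if __name__ == "__main__":
    net = make_cppn(seed=42)
    frames = [render_frame(net, 32, 32, t=k / 24) for k in range(24)]
    print(len(frames), "frames; top-left pixel of the first frame:", frames[0][0][0])
```

Because time enters through a sine and cosine of the same phase, the pattern repeats smoothly, which suits a looping video. For full-resolution output, a vectorized implementation in a library such as NumPy or PyTorch would be the practical choice.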