Project

Sox

A long-running personal robotics project that moved from digital sculpture and CAD into a physical prototype with voice conversation and motion commands.

Year: 2022-2024
Role: Builder
Team: Personal Project
Status: Completed
Sox robot cat concept render

Physical Sox prototype with clear acrylic body, camera, LED eye panel, and labeled servo legs

Physical prototype for motion commands and verbal ChatGPT interaction

Digital sculpture concept for Sox

Digital sculpture concept / Blender

Sox CAD body and head assembly

CAD body and head assembly

Sox eye and neck motion design

Eye and neck motion design

Sox component cost breakdown

Component cost breakdown

Voice synthesis and wake word training

Local GPU vision server setup

Facial recognition test

Sox viewing angle reference

Viewing angle reference

I sculpted the digital model in Blender before modeling the parts in CAD, which gave me a clearer sense of the dimensions and the overall design.

I finished the CAD for most of the body and head, though the mouth-draw motion still needs improvement. The image shown is from an early stage of the project.

The eye contains five servos so it can blink and rotate both vertically and horizontally. Two cameras will be mounted in the pupil for distance measurement and facial recognition.
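The notes do not say how distance is computed from the two cameras. One common approach is stereo triangulation, where distance follows from focal length, camera spacing, and the pixel shift (disparity) of the same point between the two images. The sketch below assumes rectified images. The function name and the example values are hypothetical, not taken from the project.

```python
def stereo_distance(focal_px: float, baseline_m: float, disparity_px: float) -> float:
    """Estimate distance with the pinhole stereo relation Z = f * B / d.

    focal_px     -- focal length in pixels (assumed equal for both cameras)
    baseline_m   -- distance between the two camera centers, in meters
    disparity_px -- horizontal pixel shift of the same point between images
    """
    if disparity_px <= 0:
        # Zero disparity means the point is effectively at infinity.
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


if __name__ == "__main__":
    # Hypothetical numbers: 700 px focal length, 6 cm baseline, 42 px disparity.
    print(f"{stereo_distance(700.0, 0.06, 42.0):.2f} m")
```

A short baseline, as with two cameras packed close together, gives small disparities at range, so the estimate gets noisier the farther away the subject is.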

The component and price list gives me a clear picture of which robot parts are still missing. The physical parts of the robot cost around $550.

I collected voice data from the movie and used it to train a customized voice model with Coqui TTS, so the robot could sound closer to the movie character.

I collected samples and trained a customized wake word in Google Colab, so the voice assistant can wake up with "Hey Sox."

I set up a GPU server to receive the robot's video stream over the internet and perform distance measurement and facial recognition, since the Raspberry Pi does not have enough compute power.
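The transport used for the video stream is not described. As a minimal sketch of one way a robot could send frames to a remote server, the code below frames each encoded image with a 4-byte length prefix over a TCP-style socket. The helper names `send_frame` and `recv_frame` are hypothetical, and the payload stands in for JPEG bytes from a real camera.

```python
import socket
import struct

# 4-byte big-endian unsigned length that precedes every frame.
HEADER = struct.Struct("!I")


def send_frame(sock: socket.socket, payload: bytes) -> None:
    """Send one encoded frame preceded by its length."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes, raising if the peer closes early."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed mid-frame")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_frame(sock: socket.socket) -> bytes:
    """Receive one length-prefixed frame and return its payload."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


if __name__ == "__main__":
    # A connected socket pair stands in for the robot-to-server link.
    robot, server = socket.socketpair()
    send_frame(robot, b"fake-jpeg-bytes")
    print(recv_frame(server))
    robot.close()
    server.close()
```

The length prefix matters because a stream socket has no message boundaries: without it, the server cannot tell where one frame ends and the next begins.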

I finished the facial recognition code, so the robot can now stream video to the GPU server and tell whether the person in frame is in the database.
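The recognition library and matching rule are not named here. A common pattern is to turn each face into an embedding vector and compare it against stored vectors, accepting the nearest one if it falls within a distance threshold. The sketch below assumes that pattern. The `identify` function, the 0.6 threshold, and the toy 3-dimensional vectors are all illustrative assumptions, not the project's actual code.

```python
import math
from typing import Dict, Optional, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def identify(
    embedding: Sequence[float],
    database: Dict[str, Sequence[float]],
    threshold: float = 0.6,  # assumed cutoff; tune for the real embedding model
) -> Optional[str]:
    """Return the closest known name within the threshold, else None."""
    best_name: Optional[str] = None
    best_dist = threshold
    for name, known in database.items():
        dist = euclidean(embedding, known)
        if dist <= best_dist:
            best_name, best_dist = name, dist
    return best_name


if __name__ == "__main__":
    # Toy database with made-up names and vectors.
    known_faces = {"alice": [0.1, 0.2, 0.3], "bob": [0.9, 0.8, 0.7]}
    print(identify([0.12, 0.19, 0.31], known_faces))  # close to "alice"
    print(identify([5.0, 5.0, 5.0], known_faces))     # no match within threshold
```

Returning `None` for an unknown face keeps the "not in the database" case explicit, which is the yes/no answer the robot needs.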