All Posts

Writing and Translating Help Files - Spwig

You may have read a bit about the exciting new project I am working on - Spwig - an all-inclusive e-commerce platform. One thing that has been exploding my ego over the last few months as I progress the development on Spwig is the bewilderment and ...

AI integrated e-commerce platform Spwig

Another wee side project I’m working on is an e-commerce platform. I’m tired of paying for subscriptions and for plugins that break and slow my store down, so I wanted to build something that I would use and offer at a good FIXED price. I’m not greedy....

My AI Rig

My AI Setup - Power, Efficiency, and Color. As an AI consultant and model creator, my tools are my partners in experimentation, discovery, and creation. Over time, I’ve built a nice little ecosystem of devices that allow me to train, test, and store A...

Disentangler and Decoder Outcome

After nearly a month of training, refactoring, fine-tuning and losing hair, I hit a wall with my disentangler / decoder models. No matter what I did, no matter how sophisticated I made the decoder, I could not get it to reconstruct the high-frequency...

Fovea Vision for Computer Vision

The other morning I was sleeping in a little while the morning sky was still dark. My wife had woken up earlier than me to tend to our boy, and at some point needed to turn on the lights in our bedroom. Something very brief but very cool happened whe...

Rebuilding Sound from its DNA

In my audio processing pipeline, I have two components that work together to transform messy, mixed audio into clear voice and noise streams. The Disentangler is responsible for disentangling voice and noise from audio and...

Update On The Disentangler Model

After training, I found a few issues with the disentangler model (designed to separate noise and voice for downstream consumption). The model has multiple output heads, including raw embeddings for voice and noise, PCEN embeddings for voice and noise a...

Building The Auditory Cortex

I'm not sure now how many iterations of model design and redesign I have been through for the auditory cortex, but as I learn more and get a stronger understanding of how the brain interprets audio signals, how these are ...

The Path to AGI - Pt 3

I'm making progress on the synthetic auditory model. I had to make some design changes to the architecture along the way, as one would expect, but the core principles remain unchanged and I'm making good progress. Right now the ...

The Path to AGI - Pt 2

I started shaking as tests of my new synthetic audio cortex model showed early signs of achieving self-learning. I expected to need a lot more tweaking and perhaps weeks, if not months, of trial and error to find the right way to build the archite...

The Path to AGI

Working on my P.A.S.T.A model with quite a satisfying degree of success, I felt something was off. I am feeling that my time spent developing training data, code and training the models has been diverting me from what I originally wanted to achieve. ...

T5 ASR Grammar Corrector

For my master project, Curious AI, I trialed several VAD and ASR models to find the right fit for the live WebRTC real-time streaming architecture. After a lot of benchmarking and tweaking, I settled on pyannote for VAD and NVIDIA NeMo f...

Cognitive Memory System Setup & Test Scenarios

Overview: This document outlines the architecture and behaviour of the Cognitive Memory System, designed to replicate human-like memory functions for real-time AI contextual awareness. The system uses a multi-tiered approach to manage snapshots of aud...

The Three Brains Concept

A Framework for Self-Improving, Curiosity-Driven AI: A Low-Cost Approach to Autonomous Learning, Reasoning, and Real-World Interaction. Dayyan James (dayyan.james@gmail.com). Abstract: This paper presents a framework for designing a low-cost, self-improvin...

Welcome to DJ-AI

A New Frontier in Real-Time Intelligence. At DJ-AI, I'm looking to build a real-time, multi-modal artificial intelligence, where perception, understanding, and memory converge. My mission is to create systems that not only see and hear, but comprehe...