# TABULA
TABULA is an experimental, work-in-progress project aimed at developing a multi-model AI cognitive brain capable of autonomous learning through real-world interactions. The system is designed to form thoughts, be motivated by curiosity, and develop its own symbolic representations of objects, experiences, and actions through its auditory and visual cortex modules.

## Goals
- Create an AI system capable of forming independent thought processes
- Develop curiosity-driven learning mechanisms
- Enable autonomous symbol creation and representation from sensory inputs
- Build self-learning capabilities similar to human cognitive development
## Architecture

The project currently focuses on two primary sensory processing systems.

### Auditory Cortex
The auditory cortex provides the model with audio embeddings necessary for pattern recognition and symbol creation in memory. Key components include:
- Disentangler: Separates voice and noise into distinct processing paths for downstream analysis
- Decoder: Reconstructs clean voice and noise signals from disentangler embeddings back into waveforms
- Designed for future self-learning speech module development
- Currently usable for voice/noise separation and cleaning
- Oneshot Inference: Combines the disentangler and decoder to produce clean voice and noise outputs from audio/video files (see the sketch after this list)
- Pattern Recognition Models:
  - Protophoneme: Identifies fundamental sound patterns
  - Protoword: Learns word structures from audio patterns
  - Protogrammar: Develops grammatical structures, mimicking infant language acquisition
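
As a rough illustration of how the disentangler and decoder fit together in the oneshot flow, here is a minimal sketch. It assumes PyTorch modules; the function name, call signatures, and I/O shapes are illustrative assumptions, not the project's actual API.

```python
import torch
import torchaudio

def oneshot_separate(audio_path, disentangler, decoder, out_prefix="separated"):
    """Separate a mixed recording into a clean voice track and a noise track."""
    # Hypothetical sketch: `disentangler` and `decoder` stand in for the
    # project's real modules, whose interfaces may differ.
    waveform, sample_rate = torchaudio.load(audio_path)  # (channels, samples)

    with torch.no_grad():
        # The disentangler splits the mixture into two embedding streams,
        # one per source (voice, noise).
        voice_emb, noise_emb = disentangler(waveform)

        # The decoder reconstructs each embedding stream back into a waveform.
        clean_voice = decoder(voice_emb)
        noise = decoder(noise_emb)

    torchaudio.save(f"{out_prefix}_voice.wav", clean_voice, sample_rate)
    torchaudio.save(f"{out_prefix}_noise.wav", noise, sample_rate)
    return clean_voice, noise
```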
### Visual Cortex

The visual cortex implements a fovea-inspired computer vision system designed for computational efficiency (see the sketch after this list):
- Foveal Focus: Concentrates processing power on areas of interest, rendering them at higher detail
- Peripheral Motion Detection: Captures movement and changes in the peripheral field
- Dynamic Attention: Shifts the focus point automatically as the scene demands
- Compute Optimization: Keeps processing fast and efficient while preserving awareness of the full visual field
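
A minimal sketch of the sampling idea, assuming numpy frames of shape (H, W, C); the function names, fovea size, and stride are illustrative assumptions, not the project's actual code:

```python
import numpy as np

def foveate(frame: np.ndarray, cx: int, cy: int, fovea: int = 64, stride: int = 4):
    """Return a full-resolution fovea crop plus a coarse peripheral view."""
    h, w = frame.shape[:2]

    # Fovea: full detail in a small window around the focus point (cx, cy).
    half = fovea // 2
    y0, y1 = max(0, cy - half), min(h, cy + half)
    x0, x1 = max(0, cx - half), min(w, cx + half)
    fovea_patch = frame[y0:y1, x0:x1]

    # Periphery: heavily subsampled view of the whole frame -- cheap to
    # process, but enough to notice changes anywhere in the field.
    periphery = frame[::stride, ::stride]

    return fovea_patch, periphery

def next_focus(prev_periphery: np.ndarray, cur_periphery: np.ndarray, stride: int = 4):
    """Dynamic attention: move the focus point to the strongest peripheral change."""
    diff = np.abs(cur_periphery.astype(np.int16) - prev_periphery.astype(np.int16))
    energy = diff.sum(axis=-1)                     # per-pixel motion energy
    y, x = np.unravel_index(np.argmax(energy), energy.shape)
    return x * stride, y * stride                  # back to full-frame (cx, cy)
```

The compute saving comes from the split: most pixels are only ever touched at 1/stride² resolution, while full detail is reserved for the small foveal window.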
## Installation

Coming soon: installation instructions will be provided as components reach stable releases.
## Documentation

### Auditory Cortex
- Disentangler Model Architecture - Specialized audio source separation system
- Decoder System - Multi-decoder system for audio reconstruction
- Decoder Extension Guide - Guide for extending decoder capabilities
- Decoder Module Structure - Detailed module architecture
- Utility Functions - Supporting utilities documentation
- Disentangler Training - Training procedures for the disentangler
- Enhanced Decoder Training - Advanced decoder training strategies
- Oneshot Inference Pipeline - End-to-end audio/video processing
- Decoder Inference - Decoder-specific inference guide

### Visual Cortex
- Modular Structure - Visual cortex training architecture
- Training Strategy - Visual cortex training approach
## Quick Start

For practical usage of the current components:

- Audio/Video Separation: See the Oneshot Inference Pipeline docs for separating voice and noise from audio or video files (a hypothetical invocation is sketched below)
- Model Training: Refer to the training documentation listed above for each component
- Architecture Details: See the architecture documentation listed above for how each system fits together
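
Until installation instructions land, a hypothetical invocation might look like this; the import path and function name are illustrative placeholders, not the project's actual entry point:

```python
# Hypothetical usage sketch -- `tabula.auditory.oneshot` and `separate`
# are placeholder names for whatever the real entry point turns out to be.
from tabula.auditory.oneshot import separate

clean_voice, noise = separate("recording.mp4")  # accepts audio or video input
```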
## Contributing

This is an experimental research project. Contributions and ideas are welcome. Please open an issue to discuss major changes before submitting pull requests.
## License

MIT License
🚧 Work in Progress - This project is in active experimental development. Components and APIs may change significantly as the architecture evolves.