// the digital futurist · est. 2020 · chennai, india
Thomas Cherickal
Generative AI Consultant · Generative AI Engineer · SLM Engineer · Rust AI Engineer · Independent Research Blogger
I build at the bleeding edge of Generative AI: AI systems engineering, SLM deployment, and technical research storytelling across domains. From engineering local LLM stacks to writing 10,000-word deep dives on frontier models, I turn complexity into clarity and code into production value. My mission is to make Generative AI, Blockchain AI, Quantum AI, and Local AI accessible to everyone. I believe Rust is the future of AI and programming. I want to be an early adopter, work as a Rust AI Engineer, and ultimately rewrite the AI and Generative AI stack in Rust. I am also fluent in Claude Code and Google Antigravity.
Who I Am
Roles & Capabilities
I operate across multiple disciplines — from deep technical development to strategic content and AI mentoring.
Generative AI Consultant
Delivering end-to-end AI transformation: office process automation with Python, website creation, custom AI app development, SEO/AEO/GEO strategy, and individual AI upskilling programs.
Generative AI Engineer
Designing, building, and deploying intelligent Generative AI systems, LLM-powered applications, and custom agentic workflows, and using AI tools to optimize those workflows programmatically.
Rust AI Engineer
Building high-performance, memory-safe AI systems and systems tooling in Rust — from writing optimized tokenizers and tensor operations to developing blazing-fast local LLM runtimes, custom Python/Mojo bindings, and SIMD-accelerated libraries.
SLM Engineer
Optimizing Small Language Models for local inference on constrained hardware. Experienced with LM Studio, Ollama, llama.cpp, Gemma 4 E4B, and Qwen, including tuning quantized models for CPU-only and low-VRAM environments.
Open Source Developer
Building production-ready intelligent systems in Rust, Python, and Golang — including local LLM orchestration pipelines, AI agent frameworks, and SLM inference engines optimized for constrained hardware.
AI Mentor & Trainer
Working 1-on-1 with individuals and teams to accelerate adoption of frontier AI tools — prompt engineering, local LLM setup, agentic workflows, AI career development, and online brand building.
Independent Research Blogger
Investigating the latest in LLM benchmarks, agentic frameworks, quantum AI intersections, open-source tooling, and hardware acceleration under the brand The Digital Futurist.
Technical Content Writer
Crafting meticulously researched long-form articles, developer tutorials, comparison deep-dives, and strategic technology playbooks across HackerNoon, Medium, Substack, Hashnode, Dev.to, and Differ. Working on the book RECRUITED.
Website Builder
Designing and deploying professional websites across every major platform — WordPress (thomascherickal.com), Wix, Framer, SITE123, GitHub Pages, and more. From landing pages to full personal brand sites with SEO, AEO, and GEO baked in.
Online Brand Builder
Building digital visibility for professionals and students across LinkedIn, HackerNoon, Substack, GitHub, and personal sites — from zero to recognized voice in their niche.
Core Expertise
Tech Stack & Domains
⚙️ Languages
🤖 AI & ML
🛠️ Local AI Stack
🗄️ Data & Storage
📚 Domains
🧰 AI Tools
🔧 Dev & Infra
🌐 Web Dev
✍️ Content
🚀 Deployment
Open Source
Developer Showcase
Building at the intersection of AI, systems programming, and developer tooling. The repositories listed here are upcoming.
// Local SLM agent with tool-calling & streaming
use openfang::agent::{Agent, Config};
use lm_studio::client::LMClient;

async fn run_local_agent() -> Result<()> {
    let config = Config {
        model: "gemma-4-e4b-it".into(),
        provider: "lm_studio".into(),
        stream: false, // stable on 2GB VRAM
        max_iterations: 100,
        ..Default::default()
    };
    let agent = Agent::new(config).await?;
    agent.run("Summarise today's AI news").await
}
# Benchmark local SLMs on constrained hardware
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:1234/v1",
    api_key="lm-studio"
)

def benchmark_model(model_id: str, prompts: list):
    results = []
    for p in prompts:
        resp = client.chat.completions.create(
            model=model_id,
            messages=[{"role": "user", "content": p}],
            stream=False
        )
        results.append(resp.choices[0].message.content)
    return results
// OpenAI-compatible gateway with multi-tenant auth
package main

import (
    "net/http"
    "net/http/httputil"
)

func ProxyHandler(w http.ResponseWriter, r *http.Request) {
    tenant := r.Header.Get("X-Tenant-ID")
    model := resolveModel(tenant, r)
    r.Header.Set("X-Model-Override", model)
    proxy := &httputil.ReverseProxy{
        Director: buildDirector(tenant),
    }
    proxy.ServeHTTP(w, r)
}
# Fused matmul kernel: MAX Engine target
from tensor import Tensor, TensorShape
from algorithm import vectorize, parallelize

fn fused_matmul[dtype: DType](
    a: Tensor[dtype], b: Tensor[dtype]
) -> Tensor[dtype]:
    var out = Tensor[dtype](TensorShape(
        a.shape()[0], b.shape()[1]))
    parallelize[compute_row](a.shape()[0])
    return out
// SIMD cosine similarity over f32 embeddings
use std::arch::x86_64::*;

pub fn cosine_sim(a: &[f32], b: &[f32]) -> f32 {
    let (dot, na, nb) = a
        .chunks_exact(8)
        .zip(b.chunks_exact(8))
        .fold((0.0, 0.0, 0.0), |acc, (xa, xb)| {
            unsafe { avx_dot_step(acc, xa, xb) }
        });
    dot / (na.sqrt() * nb.sqrt())
}
thomascherickal / slm-inference-engine
High-performance Rust library for local SLM inference on CPU-only hardware. Optimized for 2–8 GB RAM targets with streaming support.
thomascherickal / ai-career-toolkit
Python scripts, prompts, and automation tools accompanying the RECRUITED book — NotebookLM workflows, resume AI analysis, LinkedIn audit scripts.
thomascherickal / quantum-ai-experiments
Research notebooks exploring QAI intersections — Qiskit, Q#, Quantinuum stack experiments, and quantum threat modeling for blockchain systems.
thomascherickal / digital-futurist-site
Open-source GitHub Pages site with JSON-LD schema, SEO metadata, dark-mode design, and cross-platform social integration. The page you're reading now.
thomascherickal / rust-llm-router
Low-latency request router for multi-provider LLM APIs with automatic failover, rate-limit awareness, and cost-optimized model selection across OpenRouter, Ollama, and LM Studio.
thomascherickal / vector-forge
High-performance SIMD-accelerated vector operations library for embedding pipelines — cosine similarity, HNSW indexing, and ANN search at native speed for RAG workloads.
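As a reference point for what such a library accelerates, here is a plain-Python sketch of cosine similarity and exact brute-force top-k search. The function names are hypothetical and this is only a correctness baseline: it does no SIMD, and HNSW-style approximate search is not shown.

```python
import math
from typing import List, Sequence, Tuple


def cosine_sim(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    if na == 0.0 or nb == 0.0:
        return 0.0  # convention chosen for this sketch: zero vectors match nothing
    return dot / (na * nb)


def top_k(query: Sequence[float],
          corpus: Sequence[Sequence[float]],
          k: int) -> List[Tuple[int, float]]:
    """Exact nearest neighbours by scoring every vector (O(n * d))."""
    scored = [(i, cosine_sim(query, v)) for i, v in enumerate(corpus)]
    scored.sort(key=lambda t: t[1], reverse=True)
    return scored[:k]


corpus = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hits = top_k([1.0, 0.1], corpus, k=2)  # list of (index, score) pairs
```

A baseline like this is useful for checking that an optimized or approximate index returns the same neighbours on small inputs.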
thomascherickal / ai-cli
Terminal-native AI assistant built in Rust — streaming completions, tool-calling, local model support via llama.cpp bindings, and a plugin architecture for custom commands.
thomascherickal / neural-bench
Automated benchmarking suite for local LLM inference — latency, throughput, TTFT, and quality metrics with auto hardware detection and interactive leaderboard dashboard.
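The metrics named above can be derived from timestamps taken while consuming a token stream. The sketch below assumes a hypothetical `measure_stream` helper and a fake generator standing in for a real streaming model; the clock is injectable so the arithmetic can be checked deterministically.

```python
import time
from typing import Callable, Dict, Iterable, Iterator


def measure_stream(tokens: Iterable[str],
                   clock: Callable[[], float] = time.perf_counter) -> Dict[str, float]:
    """Consume a token stream and report TTFT, total latency, and throughput."""
    start = clock()
    first = None
    count = 0
    for _ in tokens:
        now = clock()
        if first is None:
            first = now  # arrival time of the first token
        count += 1
    total = clock() - start
    return {
        "tokens": count,
        "ttft_s": (first - start) if first is not None else float("nan"),
        "latency_s": total,
        "tokens_per_s": count / total if total > 0 else 0.0,
    }


def fake_stream(n: int, delay_s: float = 0.001) -> Iterator[str]:
    """Stand-in for a streaming model response (not a real client)."""
    for i in range(n):
        time.sleep(delay_s)
        yield f"tok{i}"


stats = measure_stream(fake_stream(5))
```

Note that throughput here divides by total wall time, including the wait for the first token. Some benchmarks report decode throughput separately, excluding prompt processing, so it is worth stating which definition is in use.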
thomascherickal / agent-memory-kit
Long-term episodic and semantic memory layer for AI agents using LanceDB and Nomic embeddings — pluggable into LangChain, LlamaIndex, or any agentic framework.
thomascherickal / mojo-python-bridge
Seamless Python ↔ Mojo interop toolkit — call Mojo kernels from Python notebooks, share NumPy-compatible buffers, and hot-swap critical inference paths for 10–50× speedups.
thomascherickal / llm-gateway
Production-grade OpenAI-compatible API gateway in Go — multi-tenant auth, request logging, model aliasing, streaming proxying, and Prometheus metrics out of the box.
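Multi-tenant auth and model aliasing can be illustrated with a small lookup-based sketch. It is written in Python rather than Go for brevity, and everything in it is an assumption for illustration: the tenant table, the keys, the placeholder model names, and the `handle` function. Only the `X-Tenant-ID` header name comes from the gateway snippet shown earlier.

```python
import hmac
from typing import Dict

# Hypothetical tenant table; a real gateway would load this from a store.
TENANTS: Dict[str, dict] = {
    "acme": {
        "api_key": "key-acme",
        "aliases": {"fast": "local-small-model", "smart": "local-large-model"},
    },
}


def authenticate(tenant_id: str, api_key: str) -> bool:
    """Check the tenant exists and the key matches, using a constant-time compare."""
    tenant = TENANTS.get(tenant_id)
    if tenant is None:
        return False
    return hmac.compare_digest(tenant["api_key"].encode(), api_key.encode())


def resolve_model(tenant_id: str, requested: str) -> str:
    """Map a tenant-specific alias to a backend model; unknown names pass through."""
    return TENANTS[tenant_id]["aliases"].get(requested, requested)


def handle(headers: Dict[str, str], body: dict) -> dict:
    """Authenticate the request, then rewrite the model name before proxying."""
    tenant_id = headers.get("X-Tenant-ID", "")
    auth = headers.get("Authorization", "")
    token = auth[len("Bearer "):] if auth.startswith("Bearer ") else ""
    if not authenticate(tenant_id, token):
        return {"status": 401, "error": "unknown tenant or bad key"}
    model = resolve_model(tenant_id, body.get("model", ""))
    return {"status": 200, "tenant": tenant_id, "model": model}


ok = handle({"X-Tenant-ID": "acme", "Authorization": "Bearer key-acme"}, {"model": "fast"})
denied = handle({"X-Tenant-ID": "acme", "Authorization": "Bearer wrong"}, {"model": "fast"})
```

Resolving aliases after authentication keeps one tenant's alias map from leaking into another's requests.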
thomascherickal / go-agent-sdk
Lightweight Go SDK for building tool-calling AI agents — structured outputs, parallel tool execution, retry logic, and first-class support for OpenRouter and Ollama backends.
thomascherickal / mojo-tensor-ops
GPU-accelerated tensor operations library written in Mojo — fused matmul, flash attention kernels, and BFloat16 support targeting MAX Engine and NVIDIA consumer GPUs.
thomascherickal / mojo-llm-kernels
Hand-tuned Mojo implementations of LLM attention, KV-cache, and RoPE embeddings — designed as drop-in replacements for CUDA ops on Modular MAX, with hardware-adaptive dispatch.
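For readers unfamiliar with RoPE, the operation a kernel like this computes can be written out in plain Python. This sketch assumes the interleaved-pair convention, where each pair (x[2i], x[2i+1]) is rotated by the angle position × base^(-2i/d); some implementations pair the first and second halves of the vector instead. The `rope` function is a hypothetical reference, not the repo's kernel.

```python
import math
from typing import List


def rope(x: List[float], position: int, base: float = 10000.0) -> List[float]:
    """Apply rotary position embedding to one head vector of even length."""
    d = len(x)
    if d % 2 != 0:
        raise ValueError("head dimension must be even")
    out = [0.0] * d
    for i in range(d // 2):
        theta = position * base ** (-2.0 * i / d)  # per-pair rotation angle
        c, s = math.cos(theta), math.sin(theta)
        x0, x1 = x[2 * i], x[2 * i + 1]
        out[2 * i] = x0 * c - x1 * s      # standard 2-D rotation
        out[2 * i + 1] = x0 * s + x1 * c
    return out


def dot(a: List[float], b: List[float]) -> float:
    return sum(p * q for p, q in zip(a, b))


q = [0.3, -1.2, 0.7, 0.5]
k = [1.1, 0.4, -0.6, 0.9]
# Attention scores depend only on the relative offset between positions:
score_a = dot(rope(q, 5), rope(k, 3))
score_b = dot(rope(q, 2), rope(k, 0))
```

The two scores agree up to floating-point error because both pairs of positions are two steps apart, which is the property that makes RoPE a relative position encoding.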
Featured Writing
Published Articles
Long-form technical content published on HackerNoon, Medium, Substack, Hashnode, LinkedIn, and more.
🤖 Generative AI, LLMs & Agents
⚛️ Quantum Computing & QAI
🔗 Blockchain & Web3
🔮 Futurism & AGI
🦾 AI Agentic Assistants
🛠️ AI Tools & Productivity
Published Works
Books & Long-Form
From 8,000-word deep dives to full-length career playbooks — substantive work for serious readers.
RECRUITED: The AI-Powered Career Playbook for 2026
RECRUITED
The AI-Powered Career Playbook for Professionals Who Refuse to Be Left Behind
A comprehensive transformation system showing professionals how to use frontier AI tools — ChatGPT, Claude, Gemini, NotebookLM, Perplexity — to accelerate their job search, rebuild their brand, and land roles that actually match their ambition. Real frameworks. Real tools. Real results.
// New Digital Products Released Every Week
Let's Build Together
Collaboration Services
Open to meaningful partnerships across writing, development, and AI education. Reach out on LinkedIn to connect and set up a free introductory consultation.
Online Sales & Digital Products