> removed unflatten_index and am now rewriting the math ops with stride-aware iteration
> this makes ops work on views (slice/transpose) without copying and replaces slow div/mod loops with cheap stride math
> future ops (broadcasting, slicing, batching) will work without extra steps. https://t.co/FWGzebnioR
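For context, a minimal sketch of what stride-aware iteration can look like (function and variable names here are illustrative assumptions, not the repo's actual code): instead of unflattening a flat index with div/mod at every element, keep a multi-index plus a running offset, bump the offset by each dimension's stride, and undo it when a dimension wraps. The same loop then handles non-contiguous views like slices and transposes.

```cpp
#include <cstddef>
#include <vector>

// Visit every element of a (possibly non-contiguous) view by walking a
// multi-index and a running data offset; no div/mod per element.
template <typename F>
void for_each_strided(const std::vector<size_t>& shape,
                      const std::vector<long>& strides,   // in elements, per dimension
                      long base_offset,
                      F visit) {
    for (size_t s : shape) if (s == 0) return;             // empty tensor: nothing to do
    const size_t ndim = shape.size();
    std::vector<size_t> idx(ndim, 0);
    long offset = base_offset;
    while (true) {
        visit(offset);                                      // e.g. out[offset] = f(a[offset])
        if (ndim == 0) return;                              // scalar view: single element
        size_t d = ndim;
        while (d-- > 0) {
            offset += strides[d];                           // step the current dimension
            if (++idx[d] < shape[d]) break;                 // no carry: keep visiting
            offset -= static_cast<long>(shape[d]) * strides[d];  // wrap: undo this dim's steps
            idx[d] = 0;
            if (d == 0) return;                             // outermost dim wrapped: done
        }
    }
}
```

Calling it with shape {4, 3} and strides {1, 4}, for example, walks the transpose of a contiguous 3x4 buffer without any copy.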
> implemented 2D matmul with correct shape logic
> I also have to rewrite the math ops to be stride-aware so that tensors created by views (slice, transpose) can share the same memory without copying. If ops only use flat indexing, those views break on non-contiguous tensors. https://t.co/Pavkh5Jho2
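A rough sketch of how those two pieces can fit together, with assumed names (View2D, matmul2d) rather than the actual repo types: a naive 2D matmul that validates the inner dimensions and reads both operands through row/column strides, so a transposed view multiplies correctly without materializing a copy.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// A 2D view described by a base pointer plus row/column strides (in elements).
struct View2D {
    const float* data;
    size_t rows, cols;
    long row_stride, col_stride;
    float at(size_t i, size_t j) const {
        return data[static_cast<long>(i) * row_stride + static_cast<long>(j) * col_stride];
    }
};

// Naive matmul with the shape check: (M x K) * (K x N) -> contiguous (M x N).
std::vector<float> matmul2d(const View2D& a, const View2D& b) {
    if (a.cols != b.rows)
        throw std::invalid_argument("matmul: inner dimensions must match");
    std::vector<float> out(a.rows * b.cols, 0.0f);
    for (size_t i = 0; i < a.rows; ++i)
        for (size_t k = 0; k < a.cols; ++k) {
            float aik = a.at(i, k);                      // hoisted; reused across the inner loop
            for (size_t j = 0; j < b.cols; ++j)          // i-k-j order streams through a row of b and out
                out[i * b.cols + j] += aik * b.at(k, j);
        }
    return out;
}
```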
> added add, sub, mul, div for elementwise tensor math with full broadcasting support, so tensors of different but compatible shapes interact automatically.
> added shared memory for views/reshape, letting tensors share data without copying.
> framework can do basic math now. https://t.co/oj0VuMlvMp
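As a sketch of how NumPy-style broadcasting can be wired up (assumed helper names, and deliberately the simple div/mod indexing for clarity rather than the stride walk from the newer posts above): shapes are right-aligned, each dimension must either match or be 1, and a size-1 dimension is repeated by giving it stride 0.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <vector>

// Right-align the two shapes and compute the broadcast result shape.
std::vector<size_t> broadcast_shape(std::vector<size_t> a, std::vector<size_t> b) {
    size_t n = std::max(a.size(), b.size());
    a.insert(a.begin(), n - a.size(), 1);                // left-pad with 1s
    b.insert(b.begin(), n - b.size(), 1);
    std::vector<size_t> out(n);
    for (size_t d = 0; d < n; ++d) {
        if (a[d] != b[d] && a[d] != 1 && b[d] != 1)
            throw std::invalid_argument("shapes are not broadcastable");
        out[d] = std::max(a[d], b[d]);
    }
    return out;
}

// Apply op elementwise over the broadcast result of two contiguous tensors.
std::vector<float> elementwise(const std::vector<float>& a, std::vector<size_t> ashape,
                               const std::vector<float>& b, std::vector<size_t> bshape,
                               const std::function<float(float, float)>& op) {
    std::vector<size_t> oshape = broadcast_shape(ashape, bshape);
    size_t n = oshape.size();
    ashape.insert(ashape.begin(), n - ashape.size(), 1);
    bshape.insert(bshape.begin(), n - bshape.size(), 1);
    // contiguous strides, with 0 wherever a dimension is broadcast (size 1)
    std::vector<size_t> astr(n), bstr(n);
    size_t ar = 1, br = 1, total = 1;
    for (size_t d = n; d-- > 0;) {
        astr[d] = (ashape[d] == 1) ? 0 : ar;  ar *= ashape[d];
        bstr[d] = (bshape[d] == 1) ? 0 : br;  br *= bshape[d];
    }
    for (size_t s : oshape) total *= s;
    std::vector<float> out(total);
    for (size_t flat = 0; flat < total; ++flat) {
        size_t rem = flat, ai = 0, bi = 0;
        for (size_t d = n; d-- > 0;) {                   // unflatten the output index
            size_t idx = rem % oshape[d]; rem /= oshape[d];
            ai += idx * astr[d];
            bi += idx * bstr[d];
        }
        out[flat] = op(a[ai], b[bi]);
    }
    return out;
}
```

For example, `elementwise(a, {2, 3}, b, {3}, std::plus<float>())` adds the length-3 vector b to each row of the 2x3 tensor a.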
> added metadata (dtype, device, owns_data)
> added reshape to create views without copying memory.
> improved printing to show dtype and device.
> added flat and multidimensional indexing for direct + stride-based element access. https://t.co/hW8zxyUe7i
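A small sketch of the stride bookkeeping behind that indexing (names are illustrative): row-major strides come from a right-to-left running product of the shape, and a multi-index maps to a flat offset as the dot product of index and strides.

```cpp
#include <cstddef>
#include <vector>

// Row-major strides: the last dimension has stride 1, each earlier one is the
// product of all dimension sizes to its right.
std::vector<size_t> row_major_strides(const std::vector<size_t>& shape) {
    std::vector<size_t> strides(shape.size(), 1);
    for (size_t d = shape.size(); d-- > 1;)
        strides[d - 1] = strides[d] * shape[d];
    return strides;
}

// Multi-index (i0, i1, ..., ik) -> flat offset = sum_d i_d * stride_d.
size_t flat_offset(const std::vector<size_t>& index, const std::vector<size_t>& strides) {
    size_t off = 0;
    for (size_t d = 0; d < index.size(); ++d)
        off += index[d] * strides[d];
    return off;
}
```

For shape {2, 3, 4} this gives strides {12, 4, 1}, so element (1, 2, 3) lives at flat offset 1*12 + 2*4 + 3*1 = 23.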
Implemented the first step and built the core skeleton of a tensor:
> checks that the data matches the shape
> computes strides
> keeps track of grad storage (0 for now)
> doesn't do any math yet
> prints the tensor in a readable format. https://t.co/gmwHtqs1Zt
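Roughly what that skeleton amounts to, as a hedged sketch with assumed member names rather than the actual class: validate the flat data against the shape, derive row-major strides, allocate zeroed grad storage, and print; no math ops yet.

```cpp
#include <iostream>
#include <stdexcept>
#include <vector>

struct Tensor {
    std::vector<float> data;
    std::vector<size_t> shape;
    std::vector<size_t> strides;
    std::vector<float> grad;                              // same size as data, all zeros for now

    Tensor(std::vector<float> d, std::vector<size_t> s)
        : data(std::move(d)), shape(std::move(s)) {
        size_t expected = 1;
        for (size_t dim : shape) expected *= dim;
        if (data.size() != expected)                      // valid data vs shape check
            throw std::invalid_argument("data size does not match shape");
        strides.assign(shape.size(), 1);                  // row-major strides
        for (size_t k = shape.size(); k-- > 1;)
            strides[k - 1] = strides[k] * shape[k];
        grad.assign(data.size(), 0.0f);                   // grad storage, zeroed
    }

    void print() const {                                  // readable one-line representation
        std::cout << "Tensor(shape=[";
        for (size_t i = 0; i < shape.size(); ++i)
            std::cout << shape[i] << (i + 1 < shape.size() ? ", " : "");
        std::cout << "], data=[";
        for (size_t i = 0; i < data.size(); ++i)
            std::cout << data[i] << (i + 1 < data.size() ? ", " : "");
        std::cout << "])\n";
    }
};
```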
used a different approach and it worked.
the output for the equation gives correct forward and grad results.
tried it on 2 equations and both gave correct output.
the code can be made way better, but I'm still learning and it's fine, made a working micrograd in C++ https://t.co/xDRHmfrgZL
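For readers following along, here is a compressed illustrative sketch of a micrograd-style scalar autograd in C++ (assumed names and structure, not the code in the screenshot): each Value keeps its data, grad, parents, and a closure that pushes gradient to those parents; backward() topologically sorts the graph and replays the closures in reverse.

```cpp
#include <functional>
#include <iostream>
#include <memory>
#include <unordered_set>
#include <vector>

struct Value;
using ValuePtr = std::shared_ptr<Value>;

struct Value {
    double data = 0.0, grad = 0.0;
    std::vector<ValuePtr> parents;
    std::function<void()> backward_fn = [] {};            // leaves do nothing
    explicit Value(double d) : data(d) {}
};

ValuePtr val(double d) { return std::make_shared<Value>(d); }

ValuePtr add(const ValuePtr& a, const ValuePtr& b) {
    auto out = val(a->data + b->data);
    out->parents = {a, b};
    out->backward_fn = [a, b, outw = std::weak_ptr<Value>(out)] {
        auto o = outw.lock();                              // weak_ptr avoids a shared_ptr cycle
        a->grad += o->grad;
        b->grad += o->grad;
    };
    return out;
}

ValuePtr mul(const ValuePtr& a, const ValuePtr& b) {
    auto out = val(a->data * b->data);
    out->parents = {a, b};
    out->backward_fn = [a, b, outw = std::weak_ptr<Value>(out)] {
        auto o = outw.lock();
        a->grad += b->data * o->grad;
        b->grad += a->data * o->grad;
    };
    return out;
}

// Reverse-mode pass: topo-sort the graph, then run each node's backward_fn
// from the output back to the leaves.
void backward(const ValuePtr& root) {
    std::vector<ValuePtr> topo;
    std::unordered_set<Value*> seen;
    std::function<void(const ValuePtr&)> build = [&](const ValuePtr& v) {
        if (!seen.insert(v.get()).second) return;
        for (const auto& p : v->parents) build(p);
        topo.push_back(v);
    };
    build(root);
    root->grad = 1.0;
    for (auto it = topo.rbegin(); it != topo.rend(); ++it) (*it)->backward_fn();
}

int main() {
    // z = (x + y) * x with x = 2, y = 3  =>  z = 10, dz/dx = 2x + y = 7, dz/dy = x = 2
    auto x = val(2.0), y = val(3.0);
    auto z = mul(add(x, y), x);
    backward(z);
    std::cout << z->data << " " << x->grad << " " << y->grad << "\n";  // prints: 10 7 2
}
```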
I tried to convert Andrej Karpathy's micrograd Python code to C++
I would say 95% is working but there is still a small issue where it's giving the wrong grad output, been on this for HOURS trying to fix https://t.co/8BoDRPsl7D
for hours I was thinking there was something wrong with my code and that's why it wasn't giving the gradient output, but my dumbass didn't even print grad, i wanna kms 😭
Introducing Pluto:
ML tool that lets you train models on any dataset in minutes right in your browser, with no coding required.
Just upload a dataset, pick a target column, and run multiple models in one go, then get results right away with plots.
more models, hyper-parameters coming soon.
🧵Thread on How Activation Functions Power Neural Networks:
Activation functions are the core of what makes neural networks powerful.
We'll break it down step by step:
why stacking only linear layers fails, how activation functions add non-linearity, how that changes what a network can learn, and more.
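One way to preview the first point (stated here as a standard identity, not content from the thread itself): two stacked affine layers with no activation collapse into a single affine layer,

$$W_2 (W_1 x + b_1) + b_2 = (W_2 W_1)\,x + (W_2 b_1 + b_2),$$

so depth alone adds nothing; inserting a non-linearity, as in $W_2\,\sigma(W_1 x + b_1) + b_2$, is what breaks the collapse and lets the network represent non-linear functions.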
both are very outdated; it can only manage very simple answers since it can't handle complex things, and GPT-2 has a max context window of 1024 tokens, so it would forget earlier convos and get messy.