Haha that’s awesome to hear! I appreciate it.
That’s a good question. The short answer is that I added the “fully finished” generate_fn at the beginning so I could reuse it for the single-head / multi-head / fully-finished models.
But you’re right, the bigram model doesn’t need to truncate the input, because the predicted next char depends only on the current char, not on the rest of the context. Every char already has a mapping to its likely next char in the [65][65] embedding kernel:
- t is likely to produce h
- h is likely to produce e
…
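To make that concrete, here’s a rough sketch in plain Python (not the Elixir code from the post). The table values are random placeholders and the names `make_bigram_table` and `bigram_next_logits` are made up for illustration; the only point is that the lookup uses the last token and ignores everything before it, so truncating the context can’t change the prediction.

```python
import random

VOCAB_SIZE = 65  # matches the [65][65] kernel mentioned above


def make_bigram_table(seed=0):
    # Placeholder "trained" table: row i holds the logits for whatever follows token i.
    rng = random.Random(seed)
    return [[rng.random() for _ in range(VOCAB_SIZE)] for _ in range(VOCAB_SIZE)]


def bigram_next_logits(table, context):
    # A bigram model only looks at the most recent token; earlier tokens are ignored.
    return table[context[-1]]


table = make_bigram_table()
full_context = [12, 40, 7, 33, 58]
truncated_context = full_context[-2:]  # drop everything but the last two tokens

# Both contexts end in token 58, so they hit the same row of the table.
same_prediction = (
    bigram_next_logits(table, full_context)
    == bigram_next_logits(table, truncated_context)
)
print(same_prediction)
```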
But if you try the non-truncating generate_fn on the single-head model, you’ll see this gibberish:
S:
Babthat tebrsb udinmipfsm enryarlrent cweetetpenycro oorctsstsnll miiesor in vutt? n
Crcewtoemi beoercstosl;dercta 'incasesptb wnf n f ntr',,kntn,iaessdisuyt asbitsfdcolultoitem i'osamner-mesiidr krestkuof'osdst. soutrjo la citeeemiktinavbeimtidelgkdtom ir, et, n rksyet.eyr? wstnserrrssrdewpnititoe'sirbrst i ' m lam Tseht. mt diemrt qsspiswtatroira; rfwtocycct psoubefemt mtne' w a sfrrn. setetea acaoslt selfta.
aytdestinteet,eehiyploeyiticosa snun' g
Itit nisvin eap rohuwprcoa rtto pn rhteteaepevit sho ta wdkilsrtsito ha quvb; lumorldwxetoteri'
nwsfetenaseaxesllelyeeo! lo; retiameirlrsitto rsnrcihueaaeddrdda , Tt-epinmemeb gdusifkto: tlrremnert J nr ltsb rssaorisscoa .,!enedsasdmwa R I ha rrrf nwt. duiheissyrthtprereeuid. tl yt ea vunikiicsisshmjarrpreoretrei t ira,yeyyera enoaastewrdelmocylidtrnmsU
verpseratseii? Lit atoedswottea srm' asfoyyila? pviyurnyeyigcyemt wepimirt tfwmaymo stsfolsdulfa I wsunak' vfepaa spktsytteekhopmanb' Lepswa O aegcino;elrmogymua had lei
vs. the original (truncating) generate_fn:
S:
Babther te st bud hean! fad ry sis lr,
an withanu sr mou toous miche erer fo wous in
Lou byoun, I mine sthon omf th Cor Il, oum win m.
Fro hepith sanll hitro Il ous ecist wirs hal st os nchat har hinlost, art st fpivis, be Pans che-Cre, yan!
Th, headise foacer ch orsu thive ivisen prourilay, ksese.
QPUESTV:
'
[...]
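For reference, the difference between the two variants comes down to one cropping step before each forward pass. Here’s a minimal sketch in plain Python, not the actual Elixir generate_fn: `generate`, `toy_model`, and `block_size` are hypothetical names, the toy model is a stand-in, and I pick the next token greedily to keep the example deterministic (a real generate function would normally sample).

```python
def generate(model_fn, context, max_new_tokens, block_size=None):
    # block_size=None means "never truncate"; otherwise only the last
    # block_size tokens are fed to the model on each step.
    tokens = list(context)
    for _ in range(max_new_tokens):
        window = tokens if block_size is None else tokens[-block_size:]
        logits = model_fn(window)
        # Greedy pick: index of the largest logit.
        next_token = max(range(len(logits)), key=logits.__getitem__)
        tokens.append(next_token)
    return tokens


seen_lengths = []


def toy_model(window):
    # Stand-in model: records how long its input was and always predicts
    # (last token + 1) mod 5.
    seen_lengths.append(len(window))
    logits = [0.0] * 5
    logits[(window[-1] + 1) % 5] = 1.0
    return logits


out = generate(toy_model, [0], max_new_tokens=6, block_size=4)
print(out)           # [0, 1, 2, 3, 4, 0, 1]
print(seen_lengths)  # [1, 2, 3, 4, 4, 4]
```

With `block_size=4` the model never sees more than 4 tokens at once, no matter how long the generated text gets. With `block_size=None` the input keeps growing past whatever length the model saw during training.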
Good luck with learning the transformer architecture. I think implementing it in Elixir helped cement the math behind it, so even though it took a long time, I found it worthwhile. Anyway, I hope that helped!