ibGib is Belief
People nowadays think of “the data”: I want to run a query over “the data”, I want a transform over “the data”, give me “the answer” to some problem. But that mindset just isn’t scalable at the level that is about to hit the world with IoT plus AI, when we’re going to be talking about how to evolve and combine data among silos of data - think “BigGER DataS”. That data, and all data and metadata derived from it, is inherently imperfect, existing only in limited, local projections of context. This sounds abstract, but it essentially means that ibGib is based on beliefs and not facts. It is not a knowledge graph, it’s a belief graph. This is why there is no single source of truth, as in event sourcing or conventional append-only, transactional-log-based databases. It’s not “the” database of truth, it’s “many” databases of belief. This fits perfectly with functional programming’s approach of transforming data, which is why I called the primitive DNA ibGibs “transforms”. This approach requires you to step outside the box of the current CRUD/transactional/ACID mindset - or even “eventually consistent”, CRDT-based data - and start thinking in terms of independent, autonomous, AI-IoT-microservice-based ecological systems of beliefs. In fact, it may be easiest to conceive of it as the datastore equivalent of microservices (though I call them autonomous services).
ibGib is Time
Currently there are a couple of other approaches looking specifically at this, with AFAICT “Linked Data” (by Tim Berners-Lee) being the frontrunner in popularity. I’m not an expert on it, and it’s got a lot of backing, but IMO it still lacks the inter-relationship of data, and specifically the unification of data across time with a focus on provenance. Perhaps they’re addressing that at another layer, but for ibGib, the foundational data construct takes into account the most fundamental ideas of how any data projection can evolve over time. If you have a thing, then over time that thing can change, in which case a new immutable snapshot is created with a “past” relationship pointing to the previous data frame. If you create a “new” thing, then you are creating a new timeline, with an “ancestor” pointer to the source of the fork. Each ib^gib address is essentially an RDF-triple-style URI, but each resource pointed to by that URI is itself an entire ecosystem evolving through time.
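To make the time model a bit more concrete, here is a tiny sketch in Python. The record shape and the function names are mine, not ibGib’s actual API - it just illustrates the two moves described above: changing a thing produces a new immutable frame whose “past” relationship points back at the previous frame, while forking produces a new timeline whose “ancestor” relationship points at the fork source, and every frame is addressed by its ib^gib.

```python
import uuid

def make_frame(ib: str, data: dict, rels: dict) -> dict:
    """One immutable snapshot ("frame") in a thing's timeline."""
    gib = uuid.uuid4().hex[:8]  # placeholder id; in ibGib the gib is a hash of the record
    return {"ib": ib, "gib": gib, "addr": f"{ib}^{gib}", "data": data, "rels": rels}

def fork(src: dict, new_ib: str) -> dict:
    """Create a "new" thing: a fresh timeline whose ancestor points at the fork source."""
    return make_frame(new_ib, dict(src["data"]), {"ancestor": [src["addr"]]})

def change(frame: dict, **updates) -> dict:
    """Change a thing: a new immutable frame whose past points at the previous frame."""
    new_data = {**frame["data"], **updates}
    new_rels = {**frame["rels"], "past": frame["rels"].get("past", []) + [frame["addr"]]}
    return make_frame(frame["ib"], new_data, new_rels)

root = make_frame("note", {"text": "hello"}, {})
a1 = fork(root, "my note")            # new timeline: ancestor -> note^<gib>
a2 = change(a1, text="hello ibGib")   # same timeline: past -> my note^<gib>
print(a2["addr"], a2["rels"])
```

Running this, a2 gets a new address of its own, but a1’s full address survives in a2’s “past” list, so the whole timeline stays walkable backwards through time.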
ibGib is Light
This may be the most difficult to swallow, but ibGib is about handling as much as possible in the Light. This is a stark contrast to current trends toward anonymous, trustless systems such as cryptocurrencies, and more generally toward encryption-based security strategies. If you are looking for a black-box technology whose focus is on subverting some authority, then ibGib is definitely not your solution. This stems ultimately from the belief that once an entity grows large enough, it will control the flow in and out of the black box. Solutions like end-to-end encryption all share the same flaw: someone actually has to program every piece of it, and whoever does will be the one who holds power over you.
After researching its approach to security, I’ve concluded that ibGib is a viable alternative to an encryption-based strategy - at the very least viable enough to continue forward. Providing more witnesses, with integrity, of any given event - i.e. minimal censorship and maximal transparency - facilitates discovering the truth of that situation and, conversely, makes it more difficult to hide or obfuscate it. And the nodal architecture is specifically designed to be AI/ML-algorithm friendly.
Some Technical Details & Current Status
- ibGib provides means for merkle tree-based projections of immutable data records.
- This creates an immutable Directed Acyclic Graph (DAG) at the superficial level of records, but a cyclic graph through time.
- For example, an ibGib A data frame at any single point in time cannot point to itself at its current time. So A5 can point to A4, but A5 cannot point to A5 (or any other ibGib BX that points to A5). But since A5 can point to A4 (it does this in the “past” relationship, after all), cyclic relationships such as “self” are possible.
- This is just like how you can’t have a “memory” of yourself at the very point in time at which you’re having that memory, nor can anyone else have that memory of you at that point in time.
- ibGib is content-addressable.
- Each of these records has a pointer comprising its ib and gib, which are somewhat silly-sounding names but ones with clear purposes. The ib provides flexible metadata that can change depending on your use case. The gib provides the hash that enables integrity checking of the record (including the ib field!). There’s a small sketch of this after this list.
- ibGib will be distributed, but currently is not.
- Currently there is a single server ibGib node: the one running www.ibgib.com. I’m working on the ability to create a local client node that will be able to parse a folder, map the child files and folders to ibGib, track those files like a VCS, and push to the server node.
- The VCS basics of FS -> ibGib are implemented; the ibGib -> FS direction is not yet.
- I have not implemented merging yet, as this is a non-trivial process. I’ve done experiments, but it will be a while before it has something like the power of git, tfs, mercurial, subversion (still in use?), etc.
- For dogfooding purposes, this will become a higher priority if anyone wishes to invest their own coding time. It would be an enormous step up in productivity - imagine being able to push/pull your FS for IoT/AI code just like you currently do with your text-based source code.
- Currently working on the inter-nodal communication aspect, i.e. “push”/“pull”.
- The ibGib data construct is working excellently for this use case, as at each point it is simply an exchange of ibGib data records. Integrity is built into the communication because of the ibGib internal structure.
- The communication is an open-ended expansion of a many-times signature scheme, very similar to Winternitz One-Time Signatures (WOTS) and more specifically to SPHINCS, a post-quantum, stateless, hash-based signature scheme. But the ibGib approach allows for key differences (no pun intended).
- Since the ibGib internal structure is already protected by hashes, the only thing that must be “signed” is the transaction manifest. This allows for automatic “batch signing” of data (see the signing sketch after this list).
- There is no need to intermittently increase a buffer for the many-times signature scheme. Each transaction itself both consumes and produces signature challenges. But I digress…it’s exciting to talk about because I’m working on this at this very instant, and it’s yet another aspect that ibGib’s design dovetails nicely with.
- Once this is complete, in addition to very basic VCS capability, we’ll be able to have IoT devices create their own nodes, and these will be able to “push” ibGib to wherever they’re configured, e.g. to the ibgib website or someone else’s node.
- Each node is responsible for its own node secret.
- Local VCS “repo” nodes have a node secret akin to a user’s password for their private key.
- IoT devices could be made so that some event, like a reboot or intrusion detection, could create a new node secret, and thus a new node identity.
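As promised above, here is a minimal sketch of the content-addressing idea in Python. It is purely illustrative - the field names, the JSON canonicalization, and the SHA-256 choice are my assumptions, not ibGib’s actual wire format - but it shows why the gib, being a hash computed over the whole record (including the ib field), lets any holder of a record recheck its integrity, and why relationships that refer to other records by their ib^gib addresses chain that integrity together merkle-tree style.

```python
import hashlib
import json

def compute_gib(ib: str, data: dict, rels: dict) -> str:
    """Toy gib: a hash over the whole record -- note the ib field is included."""
    payload = json.dumps({"ib": ib, "data": data, "rels": rels}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()

def address(record: dict) -> str:
    return f"{record['ib']}^{record['gib']}"

def verify(record: dict) -> bool:
    """Recompute the hash and compare it to the stored gib; tampering with the
    ib, the data, or the relationships changes the gib and breaks the address."""
    return record["gib"] == compute_gib(record["ib"], record["data"], record["rels"])

record = {
    "ib": "sensor reading",
    "data": {"celsius": 21.5},
    "rels": {"past": ["sensor reading^ab12..."]},  # addresses of earlier frames
}
record["gib"] = compute_gib(record["ib"], record["data"], record["rels"])

assert verify(record)
record["data"]["celsius"] = 99.9   # tamper with the payload...
assert not verify(record)          # ...and the stored gib no longer matches
```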
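And here is the signing sketch mentioned above. To keep it self-contained it uses a classic Lamport one-time signature rather than WOTS or SPHINCS, and every name in it is my own, so treat it as an illustration of the batch-signing idea under those assumptions rather than ibGib’s actual scheme: because the records are already protected by their gibs, only a single manifest of those gibs needs to be signed per transaction.

```python
import hashlib
import os

HASH_BITS = 256

def keygen():
    """Lamport one-time key: two random 32-byte secrets per bit of the message hash."""
    sk = [(os.urandom(32), os.urandom(32)) for _ in range(HASH_BITS)]
    pk = [(hashlib.sha256(a).digest(), hashlib.sha256(b).digest()) for a, b in sk]
    return sk, pk

def sign(message: bytes, sk):
    """Reveal one secret per bit of the message hash."""
    digest = hashlib.sha256(message).digest()
    bits = [(digest[i // 8] >> (i % 8)) & 1 for i in range(HASH_BITS)]
    return [sk[i][bit] for i, bit in enumerate(bits)]

def verify(message: bytes, signature, pk) -> bool:
    """Hash each revealed secret and compare against the matching public hash."""
    digest = hashlib.sha256(message).digest()
    bits = [(digest[i // 8] >> (i % 8)) & 1 for i in range(HASH_BITS)]
    return all(hashlib.sha256(sig_i).digest() == pk[i][bit]
               for i, (sig_i, bit) in enumerate(zip(signature, bits)))

# "Batch signing": the records are already hash-protected by their gibs,
# so the transaction needs only one signature over the manifest of gibs.
gibs = ["a1f3...", "9c0d...", "77e2..."]        # gibs of the records being pushed
manifest = "\n".join(sorted(gibs)).encode()

sk, pk = keygen()                                # one-time key pair for this transaction
sig = sign(manifest, sk)
assert verify(manifest, sig, pk)
```

In a real many-times scheme the key material would be managed across transactions (the “consumes and produces signature challenges” point above); this sketch only shows the single batch-signing step.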
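Finally, the node-secret point in the same toy style (again, my own names, and purely an assumption about how an identity might be derived): the node’s public identity is derived from its private secret, so generating a fresh secret after some event automatically yields a fresh node identity.

```python
import hashlib
import os

def new_node_secret() -> bytes:
    return os.urandom(32)

def node_identity(secret: bytes) -> str:
    """Public identity derived from the private node secret."""
    return hashlib.sha256(b"node-identity:" + secret).hexdigest()[:16]

secret = new_node_secret()
print("identity:", node_identity(secret))

# On some event (reboot, intrusion detection), roll the secret -> new identity.
secret = new_node_secret()
print("new identity:", node_identity(secret))
```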
So, I hope I’ve cleared up some of the specifics on what ibGib is and is not about, as well as provided some technical details, the current status, and a tiny glimpse into the future.
Any questions, @dimitarvp, @OvermindDL1, or anyone else? There are many more things I could speak to, for example how the approach differs from some of the other technologies: how a DNS-like node is achieved, how AI/ML plugs in for data analytics, and more. One of the most exciting things for me personally is the consequences for the querying issue: what does it mean to “query” a dataset that is, by definition, larger than you’re able to look at with any one given query?! I think the ibGib approach is very exciting! (Hint: it’s related to the AI/ML/autonomous-services aspect.) Obviously, having left GitHub makes contributing a little awkward at the moment, but I’d be glad to provide anyone an old-timey copy of the code (which still has the git repo!) until I can get the VCS stuff pushed.