ibGib - a different approach to code and data

ibgib · November 20, 2017, 1:39pm

Cool!

tl;dr yes, absolutely.

I am still working on it (if I can keep my lonely last PC from no posting again), but I’ve removed my account from GitHub for several reasons.

As I’ve gotten older (I’ve been working on this for 15+ years), I’ve had to be cleverer in self-motivation as I’ve had at best scant outside interest. And so, one of the primary reasons for removing it was to remove my safety net of an online backup system because ibGib itself IS a “backup file system” - very much internally structured like git itself (only “more so”, as I was only vaguely aware of git’s internal magic and ibGib is entirely self-similar). But unlike diffing at the text-file level, ibGib “diffs” at the semantic transform level. So I’m currently doing what it takes to use ibGib itself as its own version control system. This requires me to implement two aspects: 1. A distributed file system component, and 2. The version control facade to ibGib’s engine.

Also, since I’m basically working on a competitor (as I imagine they would see it) for git and other VCSs, I thought it would be rude and probably against their TOS. (Although for me personally, if they were that interested I would love to work with them as I’m not so interested in any one particular use case for ibGib of which VCS is only one). So what this turns out to look like is very similar to things like git + IPFS/LD/__, and as I’ve discovered more recently, something called matrix, but leveraging my more abstract approach. I’d love to go into more detail about this side of things with you, but I don’t want to spam EF (and I have so many exciting ideas on how to proceed!). You’re welcome to correspond with me directly at ibgib@ibgib.com until I get notifications working in ibGib, at which time we can use ibGib itself. (We could use it now, but the lack of notifications makes it pretty tedious). Or perhaps a thread in the Lounge section here.

But anyway, I’m now walking a tightrope (with one end burning) as quickly as I can to get an MVP up for this functionality so that I can at the very least (in VCS parlance) init a repo, do an initial commit, push to my ibGib server node, init another node, and then pull to it. I’ve already had my trusty old PC die on me (8-core AMD FX-8350, sigh…), and this one no-posted this past weekend from a CMOS checksum error, but (Lord willing) I’m hoping to get this next version out by the end of the year.

Did you download any of the code previously or were you looking at it strictly online? I can post a snapshot of my current code somewhere if you (or anyone else) would be interested. It still is (and always will be AFAIK) “open-sourced”.

OvermindDL1 · November 20, 2017, 3:56pm

Nope, there are other VC’s sources on Github. As long as it is open source and not against the laws then Github does not really care. ^.^

ibgib · November 20, 2017, 4:25pm

Ah, interesting. Well perhaps only I didn’t want to be rude in my own personal view, or maybe the fact that their own source is not open-sourced (or wasn’t as of a year ago when I checked), nor is their issue tracking done in an open way (i.e. no dogfooding). Regardless, the lack of a safety net being impetus for focus was the primary motivating factor.

OvermindDL1 · November 20, 2017, 4:27pm

They have a business. ^.^

But no, a lot of their source is not open, though some of it is, not much though, yes it is weird… >.>

ibgib · November 20, 2017, 4:34pm

Indeed. That’s my point. The idea that business can’t exist with open-sourced software seems to contradict championing the stance for non-trivial open-source software.

When I learned this about them (I had just taken it for granted that their own code was open-sourced, as well as their issue tracking), I was more than a little disheartened. But still, they’ve been a great company. But the future is open source and open data.

Which is actually one of the most interesting aspects of ibGib’s architecture, as it is an open data format (similar to OWL/triples but IMO better), where your code and data both live in a VCS-like system. But it turns out that, to make it scale, it ends up looking like a biological mechanism. It’s actually very interesting, but then again - I’m biased.

dimitarvp · November 20, 2017, 11:29pm

Thanks for the detailed and interesting answer.

I’ll indeed take it to private with you since we are gonna hijack the thread to an endless back and forth between us both.

Expect me Soon™.

OvermindDL1 · November 20, 2017, 11:30pm

Aww, I’m curious to see how this continues to develop!

dimitarvp · November 20, 2017, 11:42pm

Well, it might forever remain as a discussion and nothing would ever follow – hence the private message follow-up: I don’t want to spam the forum with theoretical chats and hand-waving. I work a lot and barely have enough time to enjoy the company of my fantastic spouse so I am not really looking forward to shrinking my already minimal free time…

Then again, things change. I am working on making the same money with fewer hours spent and I definitely want to start contributing back – not just yet-another-generic-library-about-X but something much more impactful in the long term… like that ibGib thing… and IPFS, and next-gen torrents, and basically all MerkleDAG-ish / blockchain-like distributed databases.

If we figure we’re gonna do something for real however, we’re definitely gonna announce it.

OvermindDL1 · November 20, 2017, 11:43pm

That is why I keep saying we need a bikeshedding section. ^.^

ibgib · November 20, 2017, 11:58pm

Again, Cool! Although, just to be clear, I didn’t mean to imply any real “privacy”, as that is largely what ibGib is about - or not about as it were. The goal is to be able to do everything “in the light”, gearing everything around that - with the exception of knowledge of the “secret”. Basically how far can we take Kerckhoffs’ principle of “the enemy knows the system” (or maybe that is the Shannon formulation of it…)

So any possible communications that we would have that are currently “in the dark” are going to be “in the light”. There are many, many reasons for this, but again I don’t want to spam EF.

If @OvermindDL1 wants to participate, perhaps we can just do it in the Lounge since that doesn’t spam the home page with updates. Does that sound like an appropriate use case for the Lounge, @OvermindDL1?

dimitarvp · November 21, 2017, 12:03am

@AstonJ Your thoughts on the above message by @wraiford? We would like to chat but we don’t want to spam or annoy anyone.

AstonJ · November 21, 2017, 1:11am

If it’s to do with ibGib, then I’d say use this thread

If anyone isn’t interested in the contents of this thread… they can mute it and if we do later feel it’s an ‘issue’ we can always come up with something else later - but for now I can’t see any reason why you can’t continue to use the thread (unless there’s a specific reason you’d rather not?)

With regards to a bikeshedding section… I need to know what kind of threads would go in there - so maybe start a thread in the feedback section with an outline of what you feel the section would be for, what kind of threads might be posted in it (and perhaps give an example of between 5 and 10 threads so I can see what sort of use it might get) and any other info (such as whether it should be restricted to members-only, etc). I’ll rack my brains then and see how we can best accommodate those kinds of threads

dimitarvp · November 21, 2017, 3:29am

Agreed, it’s just that in my past experiences as a moderator people don’t unsub and just blabber about “why won’t you get a room” and other BS, ignoring the very obvious fact that this is a forum.

Thanks for your permission!

ibgib · November 21, 2017, 3:16pm

ibGib is Belief

People nowadays think of “the data”. I want to run a query over “the data”. I want a transform over “the data”. Give me “the answer” to some problem. But this just isn’t scalable on the level that is about to hit the world with IoT plus AI, when we’re going to be talking about how to evolve and combine data among silos of data - think “BigGER DataS”. This data, and all data and metadata derived from it, is inherently imperfect, existing in its limited, local projections of context. This sounds abstract, but it essentially means that ibGib is based on beliefs and not facts. It is not a knowledge graph, it’s a belief graph. This is why there is no single source of truth, as there is in event sourcing or conventional append-only transactional log-based databases. It’s not “the” database of truth, it’s “many” databases of belief. This fits perfectly in the mold of functional programming’s approach of transforming data, which is why I called the primitive DNA ibGibs “transforms”. This approach requires you to step outside the box of the current CRUD/Transactional/ACID mindset or even “eventually consistent” CRDT-based data and start thinking in terms of independent, autonomous, AI-IoT-microservice-based ecological systems of beliefs. In fact, it may be easiest to conceive of it as the datastore equivalent to microservices (I call them autonomous services though).

ibGib is Time

Currently there are a couple other approaches specifically looking to this, with AFAICT “Linked Data” (by Tim Berners-Lee) being the frontrunner in popularity. I’m not an expert on it, and it’s got a lot of backing, but IMO it still lacks inter-relationship of data, and specifically the unification of data across time with a focus on provenance. Perhaps they’re addressing that at another layer, but for ibGib, the foundational data construct takes into account the most fundamental ideas of how any data projection can evolve over time. If you have a thing, over time that thing can change, in which case a new immutable snapshot is created with a past relationship pointing to the previous datum frame. If you create a “new” thing, then you are creating a new thing timeline, with an ancestor pointer to the source of the fork. Each ib^gib address is essentially an RDF triple URI. But each resource pointed to in that URI is itself an entire evolving ecosystem through time.
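
To make the snapshot and fork idea concrete, here is a minimal Python sketch. It is not ibGib's actual code: the `Frame` class, the `mutate` and `fork` helpers, and the way the address is derived are all assumptions made for illustration. It only shows that a change produces a new immutable frame whose past points at the previous one, while a fork starts a fresh timeline whose ancestor points at its source.

```python
from dataclasses import dataclass
import hashlib


@dataclass(frozen=True)
class Frame:
    """One immutable snapshot of a thing (hypothetical structure)."""
    ib: str
    data: str
    past: tuple = ()      # addresses of earlier frames in this timeline, oldest first
    ancestor: tuple = ()  # address of the frame this timeline was forked from

    @property
    def address(self) -> str:
        # Assumed address form: ib, then "^", then a hash over the frame's contents.
        content = repr((self.ib, self.data, self.past, self.ancestor))
        return f"{self.ib}^{hashlib.sha256(content.encode('utf-8')).hexdigest()}"


def mutate(frame: Frame, new_data: str) -> Frame:
    """Same timeline, new snapshot: extend `past` with the previous frame's address."""
    return Frame(frame.ib, new_data, frame.past + (frame.address,), frame.ancestor)


def fork(frame: Frame, new_ib: str) -> Frame:
    """New timeline: empty `past`, with `ancestor` pointing at the source frame."""
    return Frame(new_ib, frame.data, (), (frame.address,))


if __name__ == "__main__":
    a1 = Frame("note", "first draft")
    a2 = mutate(a1, "second draft")   # a1 itself is left untouched
    b1 = fork(a2, "note copy")
    print(a2.past == (a1.address,), b1.ancestor == (a2.address,))
```

Because the frames are frozen, the only way to "change" a thing is to create a new frame, which is what gives every thing a walkable history.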

ibGib is Light

This may be the most difficult to swallow, but ibGib is about handling as much as possible in the Light. This is a stark contrast to current trends of anonymous, trustless-based systems such as cryptocurrencies, and more generally of encryption-based strategies of security. If you are looking for a black box technology whose focus is on subverting some authority, then ibGib is definitely not your solution. This stems ultimately from the belief that once you have a large enough entity, that entity will control the flow in and out of the black box. Solutions like end-to-end encryption all have the same flaw: someone actually has to program every piece of it, and they will be the one who holds power over you.

After researching its approach to security, I’ve concluded that ibGib is a viable alternative path to an encryption-based strategy - at the very least viable enough to continue forward. Providing more witnesses with integrity of any given event, i.e. minimal censorship and maximal transparency, will facilitate discovering the truth of that situation, and conversely, will make it more difficult to hide or obfuscate it. And the nodal architecture is specifically designed to be AI/ML algorithm friendly.

Some Technical Details & Current Status

ibGib provides means for merkle tree-based projections of immutable data records.
- This creates an immutable Directed Acyclic Graph (DAG) at the superficial level of records, but a cyclic graph through time.
  - For example, an ibGib A data frame at any single point in time cannot point to itself at its current time. So A5 can point to A4, but A5 cannot point to A5 (or any other ibGib BX that points to A5). But since A5 can point to A4 (it does this in the past relationship after all), cyclic relationships such as “self” are possible.
  - This is just like you can’t have a “memory” of yourself at the point in time of yourself having the memory, nor can anyone else have that memory of you at that point in time.
ibGib is content-addressable
- Each of these records has a pointer comprising its ib and gib, which are somewhat silly-sounding names but with clear purposes. The ib provides for flexible metadata that can change depending on your use case. The gib provides the hash which enables the integrity checking of the record (including the ib field!).
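
As a rough illustration of the ib/gib split, the following Python sketch computes a gib as a SHA-256 hash over a record's contents, including the ib, and checks integrity by recomputing it. The field names (`data`, `relations`), the JSON serialization, and the choice of SHA-256 are assumptions for the example, not a description of ibGib's actual record layout.

```python
import hashlib
import json


def compute_gib(ib: str, data: dict, relations: dict) -> str:
    """Hash the whole record, ib included, using a canonical JSON encoding."""
    payload = json.dumps(
        {"ib": ib, "data": data, "relations": relations},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def make_record(ib: str, data: dict, relations: dict) -> dict:
    """Build a record whose gib is derived from everything else in it."""
    return {"ib": ib, "gib": compute_gib(ib, data, relations),
            "data": data, "relations": relations}


def address(record: dict) -> str:
    """The pointer other records would use to refer to this one."""
    return f'{record["ib"]}^{record["gib"]}'


def is_intact(record: dict) -> bool:
    """Recompute the gib; any change to ib, data or relations breaks the match."""
    expected = compute_gib(record["ib"], record["data"], record["relations"])
    return record["gib"] == expected


if __name__ == "__main__":
    rec = make_record("note hello", {"text": "hello"}, {"past": []})
    print(address(rec), is_intact(rec))
    print(is_intact(dict(rec, ib="note goodbye")))  # the ib was altered after hashing
```

Since the ib is part of the hashed payload, relabeling a record without recomputing its gib is detectable, which is the integrity property described above.
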
ibGib will be distributed, but currently is not.
- Currently there is a single server ibGib node, the one running www.ibgib.com . I’m working on the ability to create a local client node that will be able to parse a folder, map the child files and folders to ibGib, track those files like a VCS, and push to the server node.
- The VCS basics of FS -> ibGib is implemented, the ibGib -> FS is .
  - I have not implemented the merging yet, as this is a non-trivial process. I’ve done experiments, but it will be a while before it has something like the power of git, tfs, mercurial, subversion (still in use?), etc.
  - For dogfooding purposes, this will become a higher priority if anyone wishes to invest their own coding time. It would be an enormous step up on productivity, if you can imagine being able to push/pull your FS for IoT/AI code just like you do currently with your text-based source code.
- Currently working on the inter-nodal communication aspect, i.e. “push”/“pull”.
- The ibGib data construct is working excellently for this use case, as at each point it is simply an exchange of ibGib data records. It has integrity built in to the communication, because of the ibGib internal structure.
- The communication is an open-ended expansion of a many-times signature scheme, very much similar to Winternitz One-Time Signatures (WOTS) and more specifically like SPHINCS - which are post-quantum, stateless hash-based signature schemes. But the ibGib approach allows key differences (no pun intended).
  - Since the ibGib internal structure is already protected by hashes, the only thing that must be “signed” is the transaction manifest. This allows for automatic “batch signing” of data.
  - There is no need to intermittently increase a buffer for the many-times signature scheme. Each transaction itself both consumes and produces signature challenges. But I digress…it’s exciting to talk about because I’m working on this at this very instant, and it’s yet another aspect that ibGib’s design dovetails nicely with.
- Once this is complete, in addition to very basic VCS capability, we’ll be able to have IoT devices create their own nodes, and these will be able to “push” ibGib to wherever they’re configured, e.g. to the ibgib website or someone else’s node.
  - Each node is responsible for its own node secret.
  - Local VCS “repo” nodes have a node secret akin to a user’s password for their private key.
  - IoT devices could be made so that some event, like a reboot or intrusion detection, could create a new node secret, and thus a new node identity.
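
The idea that each transaction consumes one signature challenge and produces the next can be sketched with plain Lamport one-time signatures. This is a deliberately simpler stand-in for the WOTS/SPHINCS family mentioned above, and it is not ibGib's actual protocol: the key derivation from a node secret via HMAC, the function names, and the manifest format (a list of gib strings) are all assumptions. Only the manifest is signed, which is what allows a whole batch of already-hashed records to be covered by one signature.

```python
import hashlib
import hmac


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def keypair(node_secret: bytes, index: int):
    """Derive the index-th Lamport one-time key pair from the node secret (assumed scheme)."""
    sk = [[hmac.new(node_secret, f"{index}:{bit}:{v}".encode(), hashlib.sha256).digest()
           for v in (0, 1)] for bit in range(256)]
    pk = [[h(s) for s in pair] for pair in sk]
    return sk, pk


def bits(digest: bytes):
    """The 256 bits of a SHA-256 digest, most significant bit first."""
    return [(digest[i // 8] >> (7 - i % 8)) & 1 for i in range(256)]


def message(manifest, next_pk) -> bytes:
    """Bind the manifest (list of gib strings) to the next public key."""
    pk_bytes = b"".join(b"".join(pair) for pair in next_pk)
    return h("\n".join(manifest).encode("utf-8") + h(pk_bytes))


def sign_tx(node_secret: bytes, index: int, manifest):
    """Consume one-time key `index`; publish key `index + 1` as the next challenge."""
    sk, _ = keypair(node_secret, index)
    _, next_pk = keypair(node_secret, index + 1)
    sig = [sk[i][b] for i, b in enumerate(bits(message(manifest, next_pk)))]
    return {"manifest": manifest, "next_pk": next_pk, "sig": sig}


def verify_tx(current_pk, tx) -> bool:
    """Check the revealed secrets against the public key announced earlier."""
    msg_bits = bits(message(tx["manifest"], tx["next_pk"]))
    if len(tx["sig"]) != 256:
        return False
    return all(h(s) == current_pk[i][b]
               for i, (s, b) in enumerate(zip(tx["sig"], msg_bits)))


if __name__ == "__main__":
    secret = b"demo node secret"
    _, pk0 = keypair(secret, 0)          # announced once, out of band
    tx0 = sign_tx(secret, 0, ["gib-a", "gib-b"])
    tx1 = sign_tx(secret, 1, ["gib-c"])
    print(verify_tx(pk0, tx0), verify_tx(tx0["next_pk"], tx1))
```

In this sketch a receiver only needs the first public key; every later transaction is checked against the key carried by the one before it. A node that generates a new secret produces an unrelated key chain, which loosely mirrors the "new node secret, new node identity" point above.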

So, I hope I’ve cleared up some of the specifics on what ibGib is and is not about, as well as providing some technical details, the current status, and a tiny glimpse into the future.

Any questions, @dimitarvp, @OvermindDL1 or anyone else? There are many more things I could speak to, for example, the differences in approach to some of the other technologies. Like how a DNS-like node is achieved, how AI/ML plugs in for data analytics, and more. One of the most exciting things for me personally is the consequences for the querying issue. What does it mean to “query” in a dataset that is by definition larger than you’re able to look at with any one given query?! I think the ibGib approach is very exciting! (Hint: it’s related to the AI/ML/autonomous services aspect). Obviously, leaving GitHub makes contribution a little awkward at the moment, but I’d be glad to provide anyone an old-timey copy of the code (which still has the git repo!) until I can get the VCS stuff pushed.

OvermindDL1 · November 21, 2017, 3:27pm

Honestly I’d just flag such posts… ^.^;

I miss seeing your github repo develop, I was wondering why I’d not seen it in my notifications for a long while… ^.^;

ibgib · November 21, 2017, 3:35pm

Thank you! Unfortunately, you were I think the only one.

Not to fret. As soon as I get the new version up, I’ll post a link. For now, here is my dogfooding VCS issue. It is not just a flat list of items. Many of those ibGib themselves have children, but unfortunately the current UI doesn’t show children (equivalent to the little plus sign next to folders, e.g.) Just imagine a more expressive issue tracker + comment system that so tightly integrates with the VCS of the code itself!

Not to mention once we get docker images going that facilitate server/engine nodes, and exchanges like that. It’s like an entire web of personal GitHub websites that can communicate with each other, with the functionality of Twitter, Instagram, and more! All of this is very feasible.

OvermindDL1 · November 21, 2017, 3:47pm

Sadly your certificate is invalid (expired?!), cannot access the URL (and cannot override because of work requirements). ^.^;

ibgib · November 21, 2017, 4:24pm

Yeah, unfortunately couldn’t update it and I’m still unsure of the free alternatives (I know, many say they’re good…). I am toying with the idea of simply making it easier to create donate ibGib that point to outside sources. If anyone wants to donate to ibGib in the meantime, I do have a paypal.me page (Edit: any donations would be anonymous, not like brand recognition stuff - left hand, right hand…). The cert renewal would "cert"ainly (ouch) be one of the priorities right now.

As an aside, this is actually another use case for ibGib’s communication mechanism, because it wouldn’t rely on SSL (or any PKI) as it’s a hash-based stateless scheme, which would address the very near future issue of quantum computers breaking hard problems like factoring (foundation of PKI). Someone else could easily create an encryption layer on top of it, but again, I personally am not going that route for ibGib. Anyway, that’s really non-trivial, heavy stuff. SSL is great for now.

OvermindDL1 · November 21, 2017, 4:25pm

Aren’t you using Let’s Encrypt? If not, then you should. There are auto-updaters for it for just about every web server out there. ^.^

ibgib · November 21, 2017, 4:27pm

I knew you were going to say that! That falls under the “I know, many say they’re good”!!