Explore beyond the Known

January 11, 2024 - Reading time: 7 minutes

This is a somewhat-tech focused post but the spirit of the post tells the story that even though you sometimes know what you need to do, taking the time to explore a known (or unknown) promotes new ideas and opportunities.

At Intergreatme we're currently working on our new SME portal. I have been spending some time thinking about a conceptual database design. A lot of the current design work has been focused on the prototype I built in a couple of weeks (using PHP) that was repurposed to work with a larger project between Telkom, Lalela and Angelshack. Having said that, as the solution that is being built goes beyond the current transaction-only approach we have when dealing with RICA transactions, as the core focus of this portal is on FICA.

It's not secret that we use CockroachDB as our database of choice, although I do like to also use SQLite3 for building a quick prototype or analysis work - something I still make use of today (and choose modernc.org's sqlite implementation.)

While I was prototyping some code around the use of blind indexes specifically using SHA256 as the hashing method. This lead me down two different paths of research:

  1. Whether the use of SHA256 is appropriate as a blind index for what should be distinct item, like an email address;
  2. What the appropriate data type should be to store this information in CockroachDB

I had previously read up on CipherSweet, a cross-platform library that provdes searchable field level database encryption. We already make use of blind indexes at IGM, but I wondered if there isn't a better way for us to be doing this. It's through some basic searches and prompts to ChatGTP (including CipherSweet's documentation on Blind Indexing) that I also came across the BLAKE2 (and for x64 platforms, BLAKE2b) algo.

Secondly, I started to look at the appropriate data type to use to store this kind of information, and naturally, looked at the CockroachDB docs. We also use UUID's as our unique identifier, generally because the use of a sequentially incrementing ID isn't advised for a clustered database.

Reviewing the documentation on for ID Generation Functions and I noticed there is a ULID function. I've never heard of ULID before, so I figured I'd task ChatGTP to provide me with a summary of what it is as a quick-win way of finding information without needing to search for it.

ULID (specification) is a Universally Unique Lexicographically Sortable Identifier. I'm not going to go in to the properties of what it is, or the differences between UUID and ULID. But what I like about it is that it provides a timestamp component as well as an element of randomness to generate a unique identifier. This is pretty useful in terms of generating unique identifiers that have time as part of the identifier, as we work on an insert-only approach to storing data (which also presents its own challenges when it comes to performance when dealing with millions of records in a table.)

In a second example, I was watching a YouTube video about SvelteKit and Golang when I came across HTMX, which then led me to see Hyperscript (when checking out file upload functionality included in HTMX). This then led me to also find Templ, which is notoriously difficult to locate using search engines, as a different way to build UI's outside of the use of the html/template standard library.

This then led me to find the Hotwire library, which comes from 37signals' and that honestly reminds me of how I would build applications with xajax back in the day.

After some experimentation with HTMX, and the web component library Shoelace.style as well as Hotwire, I have slowly started using Hotwire over HTMX if only because HTMX had some issues with the Shoelace inputs not allowing HTML5 input validation running, as well as not sending the POST data from the form to the server. I'm sure that HTMX does support this, but it doesn't really seem to work "out the box", at least, not with my simple trial and error experiment.

The point?

If I hadn't have taken the time to do some expansive research from a particular topic, I would not have found these other ways of doing. I wasn't specifically looking for BLAKE2, ULID, or even the ID generation functions in CockroachDB. If I wasn't curious about how people are using SvelteKit 2 with Go, I'd never have found HTMX, Hyperscript, Templ and Hotwire. The outcomes were always better than expected: not only did I learn something new, but these are now also possible items for us to discuss and determine if they suit our specific use cases in a way that supports our application design.

As a side note:

Ironically, I have already experimented with building a similar style identifier that provides a timestamp and variable-length random sequence as part of some R&D as part of a link hydration/dehydration strategy. Something else I've learned as part of this process is the use of Crockford's Base32 (used in ULID generation) in which the alphabet excludes the letters I, L, O, and U to avoid confusion and abuse. Of course, for my needs as part of a link hydration/dehyration process, this is largely irrelevant. But still, something learned just from a curious question.