Short links

October 26, 2022 - Reading time: 6 minutes

Popularised by services like bit.ly, short links reduce the length of a URL so that it fits within the character limits imposed by channels like SMS or Twitter. I've written this article as a reference, because one of my clients wanted an easier way to convert a long-form URL into a shortened one for distribution.

Shortening links effectively follows a dehydrate-hydrate process: a unique key is created and stored alongside the value it stands for. This allows us to shorten a long link like this: https://kyc.intergreatme.com/za/igm.demo/?txId=xxxxd42b-6633-457f-a4d8-9b9d07cdxxxx&originTxId=yyyy656d-2536-4573-82dd-cc26ff09yyyy to https://go.intergreatme.com/xyd82c

Note: the links in this article are for illustration purposes only and do not link to anything in the Intergreatme environment. I also use tx_id and origin_tx_id in the text, whereas the URI uses txId and originTxId.

When the user navigates to our short link, the code will do a lookup on the key and then rehydrate the value as necessary. Once the link has been rehydrated, an HTTP redirect is used to navigate the user to the intended destination.

The KYC demo at Intergreatme follows this link shortening process to create a smaller URI to be sent via SMS.

The dehydration strategy involves taking our unique identifiers (the tx_id and origin_tx_id) and mapping them to a much shorter short_id. The script behind the short link then reverses the process:

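<?php
    if(!empty($_GET)) {
        // The short_id is the first key in the query string
        $short = array_keys($_GET)[0];
        try {
            // Identify the channel the user arrived from
            // ($source is not used further in this excerpt)
            $source = 'ACCESS_UNKNOWN';
            if(isset($_GET['u'])) {
                if($_GET['u'] == 'm') {
                    $source = 'ACCESS_QRCODE';
                } elseif ($_GET['u'] == 'e') {
                    $source = 'ACCESS_EMAIL';
                } elseif($_GET['u'] == 's') {
                    $source = 'ACCESS_SMS';
                } else {
                    $source = 'ACCESS_UNKNOWN_SOURCE';
                }
            }
            // Look up the most recent record for this short_id
            $tx = DB::Fetch('SELECT tx_id, origin_tx_id, company_uuid, created_on FROM rkyc_allowlist_items WHERE short_id = ? ORDER BY created_on DESC', array($short));
            // Assumes DB::Fetch returns an empty value when no row matches
            if(empty($tx)) {
                http_response_code(404);
                exit;
            }
            // Rehydrate the full URI from the stored identifiers
            $uri = 'https://kyc.intergreatme.com/za/?whitelistId='.rawurlencode($tx['tx_id']).'&originTxId='.rawurlencode($tx['origin_tx_id']);
            header('Location: '.$uri);
            echo 'If the server did not redirect you, please click here: '.htmlspecialchars($uri);
            exit;
        } catch(Exception $ex) {
            // Log the failure and tell the client something went wrong
            error_log($ex->getMessage(), 0);
            http_response_code(500);
        }
    } else {
        // No short_id supplied: send the user up one level
        header('Location: ../');
        exit;
    }
?>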

The short_id is a unique field and does not allow duplicates. A duplicate would cause a problem if two people were given the same short_id while both of their transactions were still active: the system would not know whose transaction to redirect to.

I currently use PHP's uniqid() function to create a somewhat unique identifier, and rely on a unique constraint in the database when inserting the short_id.

I chose uniqid() because I understand its limitations and know that relatively few of these identifiers will be created in this system. If you're unsure, check the PHP manual to see whether uniqid() is suitable for your purposes.
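As a minimal sketch of the uniqid() plus unique constraint approach: the createShortLink() helper, the retry loop and the in-memory SQLite database below are assumptions made for illustration, not the production code. Only the table and column names come from the query shown earlier.

```php
<?php
// Hypothetical helper: generates a short_id with uniqid() and relies on the
// UNIQUE constraint to reject duplicates, retrying a few times on collision.
function createShortLink(PDO $db, string $txId, string $originTxId, int $maxAttempts = 5): string
{
    $insert = $db->prepare(
        'INSERT INTO rkyc_allowlist_items (short_id, tx_id, origin_tx_id) VALUES (?, ?, ?)'
    );
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        // With no prefix, uniqid() returns 13 hex characters based on the current time
        $shortId = uniqid();
        try {
            $insert->execute([$shortId, $txId, $originTxId]);
            return $shortId;
        } catch (PDOException $e) {
            // SQLSTATE class 23 is an integrity constraint violation (duplicate key);
            // anything else is a genuine error and is re-thrown
            if (substr((string) $e->getCode(), 0, 2) !== '23') {
                throw $e;
            }
        }
    }
    throw new RuntimeException('Could not allocate a unique short_id');
}

// Assumed schema, using an in-memory SQLite database so the sketch is self-contained
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE rkyc_allowlist_items (
    short_id TEXT NOT NULL UNIQUE,
    tx_id TEXT NOT NULL,
    origin_tx_id TEXT NOT NULL,
    created_on TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
)');

$shortId = createShortLink(
    $db,
    'xxxxd42b-6633-457f-a4d8-9b9d07cdxxxx',
    'yyyy656d-2536-4573-82dd-cc26ff09yyyy'
);
echo 'https://go.intergreatme.com/' . $shortId, PHP_EOL;
```

Letting the database enforce uniqueness means a collision surfaces as a failed insert that can be retried, rather than as two transactions silently sharing one key.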

Our dehydration strategy therefore involves creating a unique identifier that acts as the key to the corresponding value. How flexible your strategy needs to be will ultimately shape your design.

In my particular use case, I have opted to store the tx_id and origin_tx_id in separate columns and build the redirect URL dynamically at runtime. I could just as easily have stored the entire encoded URI as the value instead, but this would hamper long-term modifications: with a simple key-value pairing, if part of the URL needs to change, I would have to rewrite every stored value.
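A small sketch of building the URL at runtime from the stored columns. The buildRedirectUrl() helper is hypothetical; the base URL and parameter names are taken from the long link shown at the start of the article.

```php
<?php
// Hypothetical helper: rehydrates the long URL from one database row.
// Changing $base (or the parameter names) changes every redirect at once,
// without touching the stored rows.
function buildRedirectUrl(array $row, string $base = 'https://kyc.intergreatme.com/za/igm.demo/'): string
{
    // http_build_query() URL-encodes the values and joins them with "&"
    return $base . '?' . http_build_query([
        'txId' => $row['tx_id'],
        'originTxId' => $row['origin_tx_id'],
    ]);
}

$row = [
    'tx_id' => 'xxxxd42b-6633-457f-a4d8-9b9d07cdxxxx',
    'origin_tx_id' => 'yyyy656d-2536-4573-82dd-cc26ff09yyyy',
];
$url = buildRedirectUrl($row);
echo $url, PHP_EOL;
```

If the destination moves, only the base URL passed to the helper needs to change.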

So my table has a short_id, origin_tx_id, and tx_id, along with a created_on timestamptz column. I always add a created_on timestamp column as I find it useful to know when a particular transaction was created. I tend to prefer an insert-only approach when interacting with databases over an update strategy, mostly because inserting a record is generally faster than updating one, but also because the tables become self-auditing. I can see the list of actions that were taken by simply looking at the inserts, and I use created_on to access the latest item. And for warehousing, I can just select the entries by date and move them to the data warehouse.
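A minimal sketch of the insert-only pattern, assuming an in-memory SQLite table with the columns listed above. The sample rows, the idea of re-issuing a link for the same transaction, and the cut-off date are all made up for illustration.

```php
<?php
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE rkyc_allowlist_items (
    short_id TEXT NOT NULL UNIQUE,
    tx_id TEXT NOT NULL,
    origin_tx_id TEXT NOT NULL,
    created_on TEXT NOT NULL
)');

// Insert-only: a re-issued link becomes a new row and the old row is left untouched
$insert = $db->prepare(
    'INSERT INTO rkyc_allowlist_items (short_id, tx_id, origin_tx_id, created_on) VALUES (?, ?, ?, ?)'
);
$insert->execute(['aaa111', 'tx-1', 'origin-1', '2022-10-25 09:00:00']);
$insert->execute(['bbb222', 'tx-1', 'origin-1', '2022-10-26 14:30:00']);

// The most recent row for a transaction is treated as the current one
$latest = $db->prepare(
    'SELECT short_id FROM rkyc_allowlist_items WHERE tx_id = ? ORDER BY created_on DESC LIMIT 1'
);
$latest->execute(['tx-1']);
$current = $latest->fetchColumn();

// Warehousing: select everything created before a cut-off date
$old = $db->prepare('SELECT short_id FROM rkyc_allowlist_items WHERE created_on < ?');
$old->execute(['2022-10-26 00:00:00']);
$toArchive = $old->fetchAll(PDO::FETCH_COLUMN);

echo $current, PHP_EOL;
```

Both rows remain in the table, so the history of what was issued and when can be read straight from the inserts.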

From here, the process is pretty straightforward. When the user navigates to the short link URL, I do a database query based on the path parameter, retrieve the relevant information and dynamically build the target URL.

I then use headers to redirect the user to the new location. I generally add some additional metadata to the short link based on the use case. For example, I'll add a GET parameter: ?u=s for SMS, ?u=e for email, and ?u=m for QR code. This allows me to determine how the user engaged with the platform. And because the short link maps back to the tx_id and origin_tx_id of the user, I also know which user performed the action.
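The channel codes could be handled with a lookup table, roughly as sketched here. Both helpers are hypothetical. The codes and source names follow the redirect script shown earlier, and the query-string layout of the link (short_id as the first key) is assumed from the way that script reads $_GET.

```php
<?php
// Hypothetical helper mirroring the if/elseif chain in the redirect script
function accessSource(array $query): string
{
    $sources = ['s' => 'ACCESS_SMS', 'e' => 'ACCESS_EMAIL', 'm' => 'ACCESS_QRCODE'];
    if (!isset($query['u'])) {
        return 'ACCESS_UNKNOWN';
    }
    $code = $query['u'];
    // Unrecognised codes (or non-string values such as u[]=x) fall through
    return is_string($code) && isset($sources[$code]) ? $sources[$code] : 'ACCESS_UNKNOWN_SOURCE';
}

// Hypothetical helper: appends the channel code when a short link is handed out
function shortLinkFor(string $shortId, string $channel): string
{
    return 'https://go.intergreatme.com/?' . $shortId . '&u=' . rawurlencode($channel);
}

$link = shortLinkFor('xyd82c', 's');

// Simulate what the redirect script would see in $_GET for this link
parse_str((string) parse_url($link, PHP_URL_QUERY), $query);
$shortId = array_key_first($query);
$source = accessSource($query);

echo $link, ' ', $shortId, ' ', $source, PHP_EOL;
```

Keeping the codes in one array makes it easy to add a new channel without extending a chain of conditions.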

Using a short link process gives us some obvious benefits:

It gives us the ability to reduce the length of a URL.

We're able to track the number of times an external resource is accessed via our shortened link.

It is easier to alter the location of the resource without impacting the user. This is probably one of the more useful benefits of using short links.