Guy Royse

Finding Bigfoot with Async Generators + TypeScript

Lately, I’ve been messing about with generators—of the synchronous and asynchronous varieties—using TypeScript. They’re not something I’ve used much and I thought it’d be a good idea to get a little better acquainted with them. And, of course, I like to share what I learn. So, let’s commence with the sharing.

Generators in TypeScript

Generators are special functions that generate a sequence of values and return iterators. For stand-alone functions, you define them by putting a * immediately after the function keyword. For functions in a class, including static ones, you put it right before the function name itself.

function* someNumbers(): Generator<number> {...}

class NumberGenerators {
  static *someNumbers(): Generator<number> {...}
}

Generators then return data using the yield keyword.

function* someNumbers(): Generator<number> {
  yield 1
  yield 2
  yield 3
}

You can then access these values just like any iterator by looping calls to .next() or by using a for...of loop.

const generator = someNumbers()

while (true) {
  const { value, done } = generator.next()
  if (done) break
  console.log(value)
}

for (const value of someNumbers()) {
  console.log(value)
}

Now, this might not sound all that interesting as, after all, you could do this by simply returning an array. However, the magic is in that yield keyword. A generator’s body isn’t actually executed until you—or your for...of loop—call .next(). Once that happens, the code runs right up to the yield statement, returns the value, and then pauses execution until the next call to .next().
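To make that pausing visible, here’s a tiny sketch. The log lines mark exactly where execution stops and resumes:

```typescript
function* chattyNumbers(): Generator<number> {
  console.log('running up to the first yield')
  yield 1
  console.log('resumed, running up to the second yield')
  yield 2
}

const chatty = chattyNumbers()
// Nothing has been logged yet. The body only starts running here:
chatty.next() // logs 'running up to the first yield', returns { value: 1, done: false }
chatty.next() // logs 'resumed...', returns { value: 2, done: false }
```

Each call to .next() runs the body just far enough to produce the next value, then freezes it in place.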

Generators don’t have to end. They can just keep on going forever. For example, you could create a generator that returns numbers from 0 to infinity and just call .next() until you’re sick of it. Or, you can use a for...of loop to create an infinite loop.

function* allNumbers(): Generator<number> {
  let i = 0
  while (true) yield i++
}

for (const value of allNumbers()) {
  console.log(value)
}

Asynchronous Generators

Generators can also be asynchronous. Rather than handing back plain values, their .next() method returns a Promise of each result. To make an asynchronous generator, mark your generator function as async; you can then yield values directly or yield Promises, which are awaited for you.

async function* allAsyncNumbers(): AsyncGenerator<number> {
  let i = 0
  while (true) yield Promise.resolve(i++)
}

To consume them, you can either call .next() and await the Promise or use a for await...of loop and not think about promises. Personally, I’m a fan of the latter.

const generator = allAsyncNumbers()
while (true) {
  const { value, done } = await generator.next()
  if (done) break
  console.log(value)
}

for await (const value of allAsyncNumbers()) {
  console.log(value)
}

Doing Something Allegedly Useful

Of course, these examples are just toys. A more practical use for asynchronous generators is handling things like reading files, accessing network services, and calling slow-running things like AI models. So, I’m going to use an asynchronous generator to access a networked service. That service is Redis and we’ll be using Node Redis and Redis Query Engine to find Bigfoot.

I’m not gonna get into the details on connecting to Redis or on how to create a schema for Redis Query Engine. There’s plenty out there about that already, some of it created by me. And, I have a repo with all the details anyhow.

However, this is TypeScript so we are gonna start out by defining some types. First, the BigfootSighting type. This type matches the JSON that we are getting out of Redis. It’s just a bunch of carefully arranged strings.

type BigfootSighting = {
  id: string
  title: string
  account: string
  classification: string
  location: {
    county: string
    state: string
    lnglat: string
  }
}

The generator itself takes a Redis query, which is just a string, and, of course, returns an AsyncGenerator of Bigfoot sightings.

Inside the generator, we start a loop that calls .ft.search() until there are no more results. Each result contains multiple JSON documents—ahem—I mean Bigfoot sightings. Totally Bigfoot sightings. I cast them and everything. So, we loop over those too, yielding each sighting as we go.

/* The redis client, INDEX_NAME, and PAGE_SIZE are set up elsewhere. See the repo. */
async function* fetchBigfootSightings(query: string): AsyncGenerator<BigfootSighting> {
  let offset = 0
  let hasMore = true

  while (hasMore) {
    /* Get a page of data. */
    const options: SearchOptions = {
      LIMIT: { from: offset, size: PAGE_SIZE },
      DIALECT: 4 // The latest dialect. Supports cool stuff like vector search.
    }

    const result = await redis.ft.search(INDEX_NAME, query, options)

    /* Loop over the resulting documents and yield them. */
    for (const document of result.documents) {
      /*
        There's only one value in the document and technically it's in a
        property named '0' but this looks better.
      */
      yield document.value[0] as BigfootSighting
    }

    /* Prepare for the next page. */
    offset += PAGE_SIZE
    hasMore = result.total > offset
  }
}

Remember, the code pauses execution after every yield. So, we won’t make another network call until after we’ve consumed the first page of sightings. This is great, because if our code decides to not consume all the results, say by calling break in our for await...of loop or just not calling .next() again, we don’t have to make another network call. Less is more.

Another nice perk here is memory efficiency. Since we’re yielding one sighting at a time and waiting between calls, we’re not slurping the entire dataset into memory all at once. That means if there are thousands of Bigfoot sightings—and you know there are—we’re only dealing with them as needed. It’s lazy in the best possible way.
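Here’s a small stand-in sketch of that behavior. No Redis involved, just a fake paged source; the point is that once we break out of the loop, later pages are never fetched:

```typescript
// A fake paged data source. Each "page" holds three numbers.
async function* pagedNumbers(): AsyncGenerator<number> {
  let page = 0
  while (true) {
    console.log(`fetching page ${page}`) // only runs when a page is actually needed
    for (let i = 0; i < 3; i++) yield page * 3 + i
    page++
  }
}

const seen: number[] = []
for await (const n of pagedNumbers()) {
  seen.push(n)
  if (seen.length >= 4) break // pages 2, 3, 4, ... are never fetched
}
```

Only pages 0 and 1 are ever “fetched” here, even though the source is infinite.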

Wrapping Up

So, that’s generators. Let’s wrap up by wrapping up some calls to this generator to execute “meaningful” queries for your application. Here are a few that will help you find Bigfoot.

function fetchAll(): AsyncGenerator<BigfootSighting> {
  return fetchBigfootSightings('*')
}

function fetchByKeywords(keywords: string): AsyncGenerator<BigfootSighting> {
  return fetchBigfootSightings(keywords)
}

function fetchByClassification(classification: string): AsyncGenerator<BigfootSighting> {
  return fetchBigfootSightings(`@classification:{${classification}}`)
}

function fetchByState(state: string): AsyncGenerator<BigfootSighting> {
  return fetchBigfootSightings(`@state:${state}`)
}

function fetchByCountyAndState(county: string, state: string): AsyncGenerator<BigfootSighting> {
  return fetchBigfootSightings(`@county:${county} @state:${state}`)
}

function fetchByLocation(longitude: number, latitude: number, radiusInMiles: number): AsyncGenerator<BigfootSighting> {
  return fetchBigfootSightings(`@lnglat:[${longitude} ${latitude} ${radiusInMiles} mi]`)
}

Happy hunting!

Renaissance Guy's House Rules

I created a tidy little one-pager of the house rules I use when I run 5th Edition Dungeons & Dragons. They’re mostly cribbed from Index Card RPG by Runehammer Games—which you should totally go out and buy—but I included one from Sly Flourish as well.

Download it here.

Namelings, Namespaces, Nicknames, and Aliases

I like words. Old words. New words. Obscure words. And, most interestingly, forgotten words. I’m not the only one who likes this sort of stuff. Several years ago, I found a site called the Compendium of Lost Words.

One word I learned from it struck me as useful. That word: nameling. A nameling is someone with whom you share a name. My name is Guy and I don’t encounter namelings very often as my name is, shall we say, uncommon. However, you might have a more common name like Bill or George (anything but Sue) and know several namelings. Regardless, now that we have a word for it, we can talk easily about the idea.

And, we can extend the idea of a nameling to other things. Things like software. Namelings occur all the time in software. We’ll be creating some module for our project and realize that its name conflicts with an existing module from another library. Or maybe we’ll have two libraries that have conflicting module names. Those modules are namelings.

We typically solve this problem using namespaces and aliases.

Namespaces provide a container, or space, for names to exist inside of to alleviate the conflict caused by the namelings. Among humans, this is the purpose served by surnames.

Aliases allow us to give namelings another name in a particular context. They are nicknames for the namelings that we use when we need to work with namelings at the same time and find using the namespace burdensome. Among humans, we use nicknames to clearly talk to, with, and about namelings.
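In code, the same moves look something like this. A TypeScript sketch with two made-up “libraries”, modeled here as plain objects, each exporting a nameling called parse:

```typescript
// Two pretend libraries, each exporting a nameling called 'parse'.
const jsonLib = { parse: (s: string): unknown => JSON.parse(s) }
const csvLib = { parse: (s: string): string[] => s.split(',') }

// The namespace (the object name, here) resolves the namelings:
jsonLib.parse('[1, 2, 3]')
csvLib.parse('a,b,c')

// Nicknames (aliases) for when both namelings are in play and
// spelling out the namespace gets burdensome:
const { parse: parseJson } = jsonLib
const { parse: parseCsv } = csvLib
parseCsv('a,b,c')
```

The same trick works across real modules with import aliases, as in import { parse as parseCsv } from some hypothetical csv library.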

I propose we change how we talk about our code in this regard. We should use these older words instead of inventing new ones. So, we can talk about it like this:

Namespaces are used to resolve namelings. Nicknames are given to namelings when both are in the code together and we don’t want to use namespaces.

This is way more fun than naming conflicts and aliases!

Northern Meetups

I’ve got a couple of upcoming meetups in some of the more northerly states in the next week or so. Specifically, Michigan and Minnesota.

I’ll be giving An Introduction to WebAssembly in Ann Arbor, Michigan on Monday, December 10th at Southeast Michigan JavaScript. WebAssembly allows you to write front-end web code in languages other than JavaScript by creating a virtual machine that runs in the browser. It’s really neat stuff and I’ll be diving into the low-level details. And, I’ll be live-coding in WebAssembly Text Format so be prepared for epic failure!

On Wednesday, December 19th I’ll be presenting what is one of my favorite talks: Deep Learning like a Viking. It’s a talk about Vikings, Keras, and Convolutional Neural Networks. And how to combine these three amazing things into an application that recognizes hand-written runes from the Younger Futhark! The talk will be hosted by JavaScript MN in Minneapolis, Minnesota.

So, if you come from the land of the ice and snow, drop by and say hi!

Machine Learning for Developers: Lies, Truth, and Business Logic

When I first heard about machine learning, my reaction was pretty much “meh”. I didn’t care. I didn’t see it affecting me or my job all that much. I was busy writing software. Software which was primarily focused on pulling data from some remote source, applying rules to that data, and putting it on a screen. Machine learning wasn’t going to change that.

Actually, that’s a bald-faced lie. I was terrified. There was this new technology out there that I had to master or I would be “left behind”. And this time, it wasn’t just a new programming language or JavaScript framework. It was something completely different. I had no idea how it was going to affect the software I wrote, but it was going to be bad.

Well, it turns out, I had it all backward. My bald-faced lie wasn’t a lie. Machine learning fit really well into what I was already doing and, while certainly different, it wasn’t shockingly so and didn’t justify my terror. Let me explain with what may seem like a digression about “business logic”.

Business logic is that bit of code that sits between the “presentation” (i.e. what the user sees) and the “data” (i.e. the information we have). It’s a sort of two-way adapter that takes the data and presents it to the user in a meaningful way and takes meaningful input from the user and saves it in a technical way.

For a simple application, there is often little to be had in the way of business logic. The data matches what the user wants to see and so these applications focus on just putting data on a screen (or perhaps a piece of paper). They tend to be easy to write and maintain because there’s no significant logic to be had.

Of course, eventually, you need a bit more. The user enters a particular value: notify them of a particular thing. The data has some special value: display some special thing. Rules like these are the business logic of the application. They start out simple and, for this reason, are often mistakenly put in the presentation or the data layers out of expediency or inexperience. But they rapidly get quite complex and you end up with a steaming plate of spaghetti code. Solution: give them their own layer.

But that business logic layer itself can get quite complex as rules grow and expand. I spent a fair bit of my career working for an insurance company where I saw this firsthand. If the state is Ohio and the county is Cuyahoga and the EPA check of the vehicle is no older than 90 days, do one thing. But if the county is Franklin or Cuyahoga (but not any other counties) and the EPA check is no older than 60 days, do some other thing. Craziness! Code like this can swiftly spiral out of control into a marinara-covered pile of noodles.

Often, the solution to this problem is a rules engine. Instead of writing a deeply nested set of hard to understand conditions, you define all your rules in an external piece of software and use that software to execute your rules. Rules engines are optimized for managing these rules and can even expose them to the business itself instead of just the developers. But sometimes even rules engines become difficult to manage and it becomes hard to understand how the rules are interacting within it. Eventually, instead of spaghetti code, you end up with a heaping portion of spaghetti rules with a side of meatballs.

At this point, there is an important realization to make. All of these approaches fall down in the face of excessive complexity. They have differing thresholds, to be sure. But, with enough complexity, they all become unmanageable. Once you’ve implemented a rules engine, have you hit the end of the line?

Oh. Hello there, machine learning.

Machine learning is like a rules engine on steroids. It allows us to create rules that encapsulate complex patterns that would otherwise be nigh impossible. But instead of us using it to define our rules, it finds the rules and then encodes them for us. All we have to provide it are examples and correct answers (i.e. features and labels) and it will create an abstraction we can use to exercise those rules (i.e. a model).

That’s a pretty neat trick!

Does that mean models should replace all business logic? Of course not. Rules engines didn’t replace all the business logic we coded. They augmented it. Sometimes a simple conditional in our code works just fine. And sometimes business logic is better managed with a rules engine. It’s not a question of code vs. rules engines vs. machine learning. It’s a menu from which we pick what we need. The business logic of our application, that layer between our data and our users, can be made of many things: simple rules in code, rules engines, and now machine learning models.

Machine learning, it turns out, doesn’t change what I’m doing. I’m still writing software which is primarily focused on pulling data from some remote source, applying rules to that data, and putting it on a screen. It’s just that we found a new way to encapsulate rules that before were too complex for us to manage or, in some cases, even define.

And that’s not scary. That’s empowering!

This post originally appeared on DataRobot.com.