Jun 25, 2016

Node.js framework for filtering, sorting and mashing data together from multiple sources in a series of asynchronous funnels

by Andrew Kandels

Funneler is an open source Node.js framework to mash together data from multiple sources asynchronously. Here's a quick example of filtering and mashing together users from two data sources and viewing a single page of results:

var _ = require('lodash'); // provides _.range, used in $map below
var Funneler = require('funneler');
 
var example = new Funneler([
    // custom plugin:
    {
        // non-command ($) indexes are stored as configuration options for plugins:
        results_per_page: 3,
        page: 2,
 
        // gather unique identifiers from databases, web services, etc.
        $map() {
            this.emit(_.range(1, 50));
        }
    },
 
    // plugins inherit common behaviors like pagination of the results which 
    // slices your identifiers to one page:
    require('funneler/lib/pagination'),
 
    // gather a page's worth of data from a database:
    {
        // gather data from one source in batches of 25 documents
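        // `User` is assumed to be a Mongoose-style model defined elsewhere
        $data: [ 25, function(identifiers) {
            // exec() already returns a promise, so return it directly;
            // a failed query then rejects instead of never settling
            return User.find({ "userNumber": { $in: identifiers } }).exec()
                .then(result => {
                    // key each document by the identifier emitted in $map
                    result.forEach(item => this.data(item.userNumber, item));
                });
        } ]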
    },
 
    // and mash/join it together by unique identifier from another data source:
    {
        $data(id) {
            this.data(id, 'title', 'Item #' + id);
        }
    }
]);
 
example.exec().then(data => {
    console.log(data);
});

Results in:

[
    {
        userNumber: 4,
        firstName: "Steve",
        lastName: "Newman",
        title: "Item #4"
    },
    {
        userNumber: 5,
        firstName: "Sally",
        lastName: "Baker",
        title: "Item #5"
    },
    {
        userNumber: 6,
        firstName: "Al",
        lastName: "Rivers",
        title: "Item #6"
    },
]

Command steps

Funneler borrows familiar terms from the data/ETL world to filter, transform and translate data through a series of passes, each either synchronous or asynchronous: $map, $reduce, $sortData, $sort and $data. The commands make heavy use of promises to delegate processing and make asynchronous development easier.
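To illustrate how a series of passes can mix synchronous and asynchronous steps, here is a minimal sketch in plain JavaScript. It is not Funneler's implementation: runSteps and the three steps are hypothetical names made up for this example, and the steps only loosely echo the commands listed above.

```javascript
// Hypothetical helper, not part of Funneler: runs each step in order and
// waits for any promise a step returns before starting the next one.
function runSteps(steps, initial) {
    return steps.reduce(
        (chain, step) => chain.then(step),
        Promise.resolve(initial)
    );
}

// Illustrative steps (assumptions for this sketch only).
const steps = [
    // gather identifiers (synchronous)
    () => [5, 3, 8, 1, 9, 2],
    // filter out identifiers we don't want (asynchronous)
    ids => new Promise(resolve => {
        setTimeout(() => resolve(ids.filter(id => id > 2)), 10);
    }),
    // sort what is left (synchronous)
    ids => ids.slice().sort((a, b) => a - b)
];

const sorted = runSteps(steps);
sorted.then(ids => console.log(ids));
```

Because every step is chained with then(), a step can return either a plain value or a promise and the next step sees the resolved value either way.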

Use cases

Funneler can be used to drive lists of data. Imagine your customer has a list of items like blogs, clients or products. Funneler can gather the proprietary information from your application and mash it up with data from a variety of other sources, such as Google Maps, government census data, web searches, APIs, or just about anywhere else. Where a database restricts you to joins between tables, Funneler can join two pieces of data by any common key.

Problems working with lists

When building lists, first comes filtering and sorting. Once this is done, you often gather the top results and display a single page. Filtering and sorting operations have to be applied to every row -- but the gather step might only touch a handful of rows and often displays additional information not part of the sort or filter.

A common mistake when working with a large relational database is to try to build your filtering, sorting and gathering all in one query and let the database do the work. The problem is you're potentially gathering enormous amounts of data but only displaying a tiny portion. This often forces the database to build large temporary tables, which slows down performance.

Funneler can solve this problem by working with light data sets that contain only what it needs to filter and sort. A series of follow-up steps can then join/mash only the data you'll be displaying with external sources. This means potentially joining 25 rows at a time against expensive APIs or third-party data sets rather than your whole data set.
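The idea can be sketched in plain JavaScript without Funneler itself. Everything here is an assumption made for illustration: lightRows is made-up data, expensiveLookup stands in for a costly API call, and chunk and getPage are hypothetical helpers.

```javascript
// Light rows: only the fields needed to filter and sort (made-up data).
const lightRows = [];
for (let id = 1; id <= 1000; id++) {
    lightRows.push({ id: id, score: (id * 37) % 101, active: id % 2 === 0 });
}

// Stand-in for an expensive API or third-party lookup; counts its calls.
let lookupCalls = 0;
function expensiveLookup(ids) {
    lookupCalls++;
    return Promise.resolve(ids.map(id => ({ id: id, title: 'Item #' + id })));
}

// Split an array into batches of at most `size` items.
function chunk(items, size) {
    const batches = [];
    for (let i = 0; i < items.length; i += size) {
        batches.push(items.slice(i, i + size));
    }
    return batches;
}

function getPage(page, perPage, batchSize) {
    // 1. Filter and sort every row, but only on the light fields.
    const ids = lightRows
        .filter(row => row.active)
        .sort((a, b) => b.score - a.score || a.id - b.id)
        .map(row => row.id);

    // 2. Keep one page of identifiers.
    const pageIds = ids.slice((page - 1) * perPage, page * perPage);

    // 3. Gather the heavy data for that page only, in batches.
    return Promise.all(chunk(pageIds, batchSize).map(expensiveLookup))
        .then(batches => [].concat(...batches));
}

const pageOfResults = getPage(2, 50, 25);
pageOfResults.then(rows => {
    // prints "50 rows gathered in 2 lookups"
    console.log(rows.length + ' rows gathered in ' + lookupCalls + ' lookups');
});
```

All 1,000 light rows pass through the filter and sort, but the expensive lookup only ever sees the 50 identifiers on the requested page, split into two batches of 25.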

Additionally, Funneler can bridge the gap between databases. With microservices and APIs, your application's data might be split between in-memory stores (Memcached, Redis, etc.), relational databases (MySQL), document databases (MongoDB) and services/APIs (JSON). Funneler can take a single data item and merge its variations from all these sources into a single document for display or visualization.
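As a rough sketch of that merge, again in plain JavaScript rather than Funneler's own API, the example below combines three partial records for one identifier. The cache, table and fetchProfile sources are made-up stand-ins, and loadDocument is a hypothetical helper.

```javascript
// Made-up stand-ins for three kinds of sources, each keyed by user id.
// In-memory store (e.g. Redis):
const cache = new Map([[4, { lastSeen: '2016-06-20' }]]);
// Relational table (e.g. MySQL):
const table = [{ id: 4, firstName: 'Steve', lastName: 'Newman' }];
// Remote service returning JSON:
function fetchProfile(id) {
    return Promise.resolve({ id: id, title: 'Item #' + id });
}

// Hypothetical helper: ask every source for its part of the record,
// then merge the parts into one document keyed by id.
function loadDocument(id) {
    return Promise.all([
        Promise.resolve(cache.get(id) || {}),
        Promise.resolve(table.find(row => row.id === id) || {}),
        fetchProfile(id)
    ]).then(parts => Object.assign({ id: id }, ...parts));
}

const merged = loadDocument(4);
merged.then(doc => console.log(doc));
```

Later sources overwrite earlier ones when field names collide, so the order of the array passed to Promise.all doubles as a simple precedence rule in this sketch.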

Using Funneler

Check out Funneler on GitHub. I've already built a pagination plugin and intend to write additional plugins and examples for working with some common public data sets. Ideally, I'd love for Funneler to be the go-to for driving lists of information in my Node.js projects. Let me know what you think, and feel free to contribute any PRs or feature ideas you might have as well.