---
url: /blog/post/2020-03-01-actionhero-developer-survey-results.md
description: >-
  Thank you to everyone who took our 2020 developer survey — Here are the results. Also, the launch of Actionhero Office Hours!
---

Thank you to everyone who took our 2020 developer survey!

![](/images/medium-export/1__xttMYnX__HSXA2VFjXiS__Tw.png)

### Introduction

2019 was a great year for the Actionhero Project! We passed 2,000 Github stars 🌟, moved from Javascript to Typescript, and you have downloaded `actionhero` from the [npm, Inc.](https://medium.com/u/b06982b22bf4) repository more than 300,000 times! Thank you.

Now it’s time to look to the future. As the core maintainer, this survey has been very helpful for finding our community’s pain-points and obtaining structured feedback. 36 of you responded, and in this post I want to analyze the survey results, discuss common themes, and share my plans for the future.

*A quick note on the survey methodology & bias: The survey link was posted on the Actionhero website, NPM page, Slack Team, and Github page for ~5 weeks. There’s a strong bias in these results toward folks that already use Actionhero regularly, as they’ll be visiting those pages the most often. The survey was also only posted in English and run via Google Forms, which may have influenced the types of folks who were able to respond.*

### Actionhero is doing Well!

First and foremost… it seems that most Actionhero developers are satisfied! We received high marks on: "Developer Experience": ***4.1/5***, "Ease of Use": ***4.0/5*** and "Documentation": ***3.3/5***.

![](/images/medium-export/1__XhCAwanSmxdoVbp49BGNdg.png)

![](/images/medium-export/1__WdK0N1fNMx3HzJukFw7hOg.png)

![](/images/medium-export/1__8SgQaiayK7lLO5__1MfIPSw.png)

Of these 3 areas, "Documentation" is the area we need to focus on the most. This matches some of the free-form feedback we received as well (more on this below).

### The Community Needs to Grow

The average Actionhero developer has been using Actionhero for over a year, and we don’t have many new users compared to "veterans".

![](/images/medium-export/1__f3D2LOupPYdNRtl__U6Ywug.png)

Folks are finding out about Actionhero via coworkers or [GitHub](https://medium.com/u/8df3bf3c40ae), but not at conferences or on social media.

![](/images/medium-export/1__R7av8k9GSCVDxf3WKr3DsQ.png)

In the "*What would you like to see from Actionhero in 2020?*" free-response question, a number of you wanted better examples and documentation… and hinted this might be a way to help attract and onboard new users.

### Plugins, Plugins, Plugins!

The most popular plugin by far is [`ah-sequelize-plugin`](https://github.com/actionhero/ah-sequelize-plugin), with [`ah-resque-ui`](https://github.com/actionhero/ah-resque-ui) in second place. Which makes sense… folks want to connect to the most popular databases and inspect the status of their tasks. The Task system is one of the main reasons folks are choosing Actionhero.

Many of you wanted more plugins to use in your projects. The most popular topics were:

* Authentication
* New servers/transports, like gRPC and HTTP/2
* New database connections, like MongoDB

So if you were to ask me what new features we should add to Actionhero in 2020, I would put on my best Steve Ballmer voice and respond with "Plugins, Plugins, Plugins"!

![](/images/medium-export/1__F2OpxoI__u8C3uiu9foh__qg.jpeg)

I think that the core of Actionhero is in a great place now that we’ve moved to Typescript, so I don’t expect another change that drastic in 2020.
I believe we should focus on adding new functionality via plugins. For example, I think we can support the [Serverless Framework](https://serverless.com/) via a plugin as a new "[Server](https://www.actionherojs.com/tutorials/servers)". I think we can make deployment "packs" for most of the popular platforms (AWS, GCE, etc.) to help with the environment and more… also via plugins.

Speaking of plugins, many of the free-response comments noted that how Actionhero boots was problematic, as it didn’t allow a plugin to "inject" settings, change the environment variables, or otherwise deal with something needed by the environment. Our previous `boot.js` tool wasn’t enough. This early piece of feedback was already used to drive the [v22 release](https://github.com/actionhero/actionhero/releases/tag/v22.0.0), moving to a more developer-managed runtime.

### Miscellaneous

It looks like React and Angular are tied for popularity among the folks that use Actionhero, along with Vue.

![](/images/medium-export/1__rgH0J2fVppMtQ4kQDBDoIw.png)

Most people don’t use ***Chat***, while most people do use ***Tasks***.

![](/images/medium-export/1__3s7UQLTVNkis66yv4WI68Q.png)

![](/images/medium-export/1__zV5HloyJ1QaF5H6OXyRIcg.png)

Actionhero is deployed in *so* *many* different ways.

![](/images/medium-export/1__MrPfSL7qVqoof7byUeATyg.png)

Our community is split almost 50/50 on Typescript.

![](/images/medium-export/1__Q0qAI5nochXDZnQZrsivhw.png)

And finally, the majority of Actionhero users are in the USA, and speak English when developing Actionhero. Germany and Ukraine (with their respective languages) are #2 and #3 in popularity. 🇺🇸🇩🇪🇺🇦

### Next Steps: "Actionhero Office Hours"

Putting on my Product Manager hat, here’s my priority list of what to work on, in order, based on this survey:

1. ✅ Change how Actionhero boots to allow modification in `server.ts` for local customizations, loading config into the env, etc. This was the core change in the [v22 release of Actionhero](https://github.com/actionhero/actionhero/releases/tag/v22.0.0).
2. ✅ Update the [`actionhero-tutorial`](https://github.com/actionhero/actionhero-tutorial) project to Typescript. Done!
3. (in progress) Update [`ah-resque-ui`](https://github.com/actionhero/ah-resque-ui) to work with v20+ of Actionhero… and perhaps include it in newly generated Actionhero projects?
4. Display the "[Tutorials](https://www.actionherojs.com/tutorials)" part of the website more prominently, as some folks didn’t know it existed.
5. Better explain/handle the "shutdown" of the server, and how to handle tasks (see: [https://blog.evantahler.com/production-node-applications-with-docker-3-devops-tips-for-shutting-down-properly-ed54f09f0a7f](https://blog.evantahler.com/production-node-applications-with-docker-3-devops-tips-for-shutting-down-properly-ed54f09f0a7f)). Better explain this for Actionhero directly.
6. Translate the documentation & website into other languages.
7. Evangelize Actionhero at conferences and online.
8. Make awesome plugins.

I would love to enable more members of the community to help with this work, so I am making an open commitment of my time to help you work on these features! I think we’ve made Actionhero easy to work on, as it uses so many of its own concepts internally, has a great test suite, and many dependencies. If you are new to programming, or new to Actionhero — I’m committing a few hours a week to help new developers learn how Actionhero works and to make it better. Let’s call these **Actionhero Office Hours**.
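As an example of what priority #1 produced, here is a minimal sketch of the new developer-managed boot file in a v22 project (the generated `server.ts`, shown here without type annotations). This is my reading of the v22 `Process` API rather than the exact generated file, so treat it as illustrative and check the v22 release notes for the real thing:

```js
// server.ts (sketch, no type annotations) -- assumes the `Process` class and
// `registerProcessSignals()` helper exported by actionhero v22+
import { Process } from "actionhero";

async function main() {
  // anything your plugins or environment need can happen here, before boot:
  // load secrets, set process.env values, tweak config, etc.

  const app = new Process();

  // stop gracefully on SIGINT/SIGTERM and exit with the right code
  app.registerProcessSignals((exitCode) => process.exit(exitCode));

  await app.start();
}

main();
```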
Please join me on [slack.actionherojs.com](http://slack.actionherojs.com) to choose a feature to work on, and talk about ideas!

I hope for these to be like Office Hours at a university — while we all have the same general topic (Actionhero!), it will be unstructured time to dive deep into a particular area, both with me and the other attendees. Within the first few minutes, we’ll collect the topics you come with, and divide up the time in the session between them. Hopefully I or the other attendees can help unblock you, or you might find a peer interested in the same topic to work with! After a few initial Office Hours, we can see which other times make the most sense for the community, and perhaps do 2-per-week. We can work on Pull Requests, new plugins… or even that Actionhero-centric tech talk you are working on!

*The inaugural "Actionhero Office Hours" will be between 5–6PM (USA Pacific Time) on March 4th, and then repeat every Wednesday. The Google Hangout URL will be posted in our* [*slack.actionherojs.com*](http://slack.actionherojs.com) *and on* [*Twitter*](https://twitter.com/actionherojs) *a few hours beforehand.*

Thank you, and I look forward to seeing you at Office Hours 🎓!

![](/images/medium-export/1__U66BPxslmDSBi__iqGgtNEg.png)

---

---
url: /blog/post/2011-11-02-a-blog.md
description: 'Yep, I made a blog.'
---

Yep, I made a blog.

#### CHANGELOG

* I started this blog on a self-hosted [WordPress](https://wordpress.org/) installation on [DreamHost](https://www.dreamhost.com/) in November of 2011
* In December of 2012 I moved the blog to [Jekyll](https://jekyllrb.com/) and used [Github Pages](https://pages.github.com/) for hosting
* In April of 2015 I moved the whole thing to [Medium](https://medium.com/).
* In July of 2021 I moved the whole thing back to my own site as a next.js + markdown site.

---

---
url: /blog/post/2016-05-28-a-memory-leak-in-node.md
description: >-
  We recently found & solved a memory leak in ActionHero. If you use ActionHero to serve static assets, you should see a significant memory reduction!
---

We recently found & solved a memory leak in ActionHero. If you use ActionHero to serve static assets, you should see a significant memory reduction with long-running servers. **Yay**!

![](/images/medium-export/1__OwDaBcU2IcB1SXCb7rfIsA.png)

This leak was discovered by the [TaskRabbit](https://www.taskrabbit.com) team when one of their micro-services kept restarting every few days.
TaskRabbit uses [monit](https://mmonit.com/monit/) to run all of their applications, and when an app uses too much RAM, it will HUP the app (performing a graceful restart) and notify the team via PagerDuty:

```text
# managed by ansible
CHECK PROCESS actionhero-{{ application }} WITH PIDFILE /home/{{ deploy_user }}/www/{{ application }}/shared/pids/cluster_pidfile

START PROGRAM "/bin/bash -c 'source /home/deploy/.profile && cd /home/{{ deploy_user }}/www/{{ application }}/current && HOME=/home/{{ deploy_user }} ./node_modules/.bin/actionhero start cluster --daemon --workers={{ actionhero_workers }}'" as uid {{ deploy_user }} with timeout 30 seconds
STOP PROGRAM "/bin/bash -c 'kill `cat /home/{{ deploy_user }}/www/{{ application }}/shared/pids/cluster_pidfile`'" as uid {{ deploy_user }}

if mem is greater than 600 MB for 5 cycles then exec "/bin/bash -c 'kill -s USR2 cat `cat /home/{{ deploy_user }}/www/{{ application }}/shared/pids/cluster_pidfile`'"
if totalmemory is greater than 800 MB for 10 cycles then exec "/bin/bash -c 'kill -s USR2 cat `cat /home/{{ deploy_user }}/www/{{ application }}/shared/pids/cluster_pidfile`'"
if mem is greater than 600 MB for 5 cycles then alert else if passed for 3 cycles then alert
if totalmemory is greater than 800 MB for 10 cycles then alert else if passed for 3 cycles then alert
if cpu is greater than 25% for 20 cycles then alert else if passed for 3 cycles then alert
if totalcpu is greater than 90% for 10 cycles then alert else if passed for 3 cycles then alert
if uptime < 1 minutes for 3 cycles then alert else if passed for 3 cycles then alert
```

* *Note that in Ansible, the `{{ }}` variables are interpolated, and won’t be there in the final file*

So what was happening here? This application had previously never served any static files, and was an API endpoint (which runs [tr.co](http://tr.co/e)). Now, with the introduction of [Universal Links in iOS](https://developer.apple.com/library/ios/documentation/General/Conceptual/AppSearch/UniversalLinks.html), we are also serving up the typical apple-app-site-association file. To support multiple environments, we actually use an action to serve this file, and return the proper one:

```js
exports.action = {
  name: "apple-app-site-association",
  description: "I return the wacky payload apple needs",
  run: function (api, data, next) {
    data.connection.sendFile("apple-app-site-association/" + api.env + ".json");
    data.toRender = false;
    next();
  },
};
```

```js
exports.default = {
  routes: function (api) {
    return {
      all: [
        {
          path: "/apple-app-site-association",
          action: "apple-app-site-association",
        },
      ],
    };
  },
};
```

… and it works fine. However, the only change the application underwent which triggered the memory leak was this action… so there has to be something wrong here.

After scouring the application itself with [node-debugger](http://www.actionherojs.com/docs/#debugging), nothing looked out of whack, but after adding in a [load-test](https://www.npmjs.com/package/loadtest), the problem was certainly reproducible. From there I dove into ActionHero’s core, and sure enough, sending lots of files, either through the static server or `data.connection.sendFile()`, caused a leak :(

After digging around deep in ActionHero’s guts, I narrowed down the problem to the fact that there was an ever-growing number of HTTP connections (an internal object to ActionHero which represents the state of a request) that just never completed. Less than 1% of the connections have this problem, but over time, it would be enough to cause the leak.
I decided to see if I could reproduce the problem from scratch, in a simple, 1-file node app… and here is the result:

```js
var fs = require("fs");
var http = require("http");

var file = __dirname + "/index.html";
var connections = {};
var idCounter = 0;
var port = 8080;

var handleRequest = function (request, response) {
  // track every in-flight request so we can see which ones never complete
  idCounter++;
  var id = idCounter;
  connections[id] = { req: request, res: response };

  response.on("finish", function () {
    delete connections[id];
  });

  fs.stat(file, function (error, stats) {
    if (error) {
      throw error;
    }
    response.writeHead(200, [["Content-Length", stats.size]]);
    var fileStream = fs.createReadStream(file);
    fileStream.on("open", function () {
      fileStream.pipe(response);
    });
    fileStream.on("error", function (error) {
      console.log(error); // no errors are caught
    });
  });
};

http.createServer(handleRequest).listen(port);
console.log("server running on port " + port);

setInterval(function () {
  console.log("connections: " + Object.keys(connections));
}, 5000);
```

If you were to run this server, and use the [loadtest](https://www.npmjs.com/package/loadtest) module against the server:

```bash
npm install loadtest
./node_modules/.bin/loadtest -c 10 --rps 200 http://localhost:8080
```

You would get a result showing that a small number of connection objects are hanging around and never resolved:

```text
> node server.js
server running on port 8080
connections:
connections: 2867
connections: 2867
connections: 2867
connections: 2867
connections: 2867
connections: 2867
connections: 2867
connections: 2867
connections: 2867
connections: 2867
connections: 2867,12403
connections: 2867,12403
connections: 2867,12403
connections: 2867,12403
connections: 2867,12403
connections: 2867,12403
connections: 2867,12403
connections: 2867,12403
connections: 2867,12403
connections: 2867,12403
connections: 2867,12403,22350
connections: 2867,12403,22350
```

… and there’s our leak! Now to figure out why.

After struggling with this for a few days, I threw up my hands and decided this might be a bug in node.js’ core, and created an issue. Luckily, some helpful folks were able to point out my error. I had been testing for all sorts of errors with the file stream… but I was forgetting about errors which might happen with the HTTP connection. If an HTTP connection completes happily, it emits the `finish` event. However, if a response object is prematurely closed (perhaps the client disconnects, has networking trouble, etc.), the response object will emit the `close` event instead… an event we were ignoring! This certainly correlates with the behavior we were seeing… a small number of requests might have trouble, disconnect… and ActionHero would never free that connection object.

The relevant sections of ActionHero were fixed in [version 13.4.1](https://github.com/evantahler/actionhero/releases/tag/v13.4.1), and now our connection logic for sending files via HTTP looks like:

```js
// the 'finish' event denotes a successful transfer
connection.rawConnection.res.on("finish", function () {
  connection.destroy();
});

// the 'close' event denotes a failed transfer, but it is probably the client's fault
connection.rawConnection.res.on("close", function () {
  connection.destroy();
});
```

---

---
url: /blog/post/2011-12-03-the-webkit-console-is-rendered.md
description: >-
  I love doing web development in Chrome and Safari, as they have excellent built-in developer tools. There is one catch that I need to keep…
---

I love doing web development in Chrome and Safari, as they have excellent built-in developer tools.
There is one catch that I need to keep reminding myself about, however, that I thought I would list here: **The console is rendered just like the body is (sometimes)**.

This important note may seem a bit cryptic, but it can best be explained by taking a look at what happens when you ask the Chrome console to *console.log()* the value of various data types. Imagine you had the following simple page:

```js
var test_int = 0;
var test_arr = [];
var test_obj = {};
var sleep = 2;

function log_and_increment() {
  if (test_int < 3) {
    test_int = test_int + 1;
    test_arr.push(test_int);
    test_obj["counter_" + test_int] = test_int;

    console.log("\r\n value of test integer: ");
    console.log(test_int);
    console.log("value of test array: ");
    console.log(test_arr);
    console.log("value of test object: ");
    console.log(test_obj);

    // pass the function itself (not a string) so the loop actually continues
    setTimeout(log_and_increment, sleep);
  }
}

window.onload = function () {
  console.log("Starting Tests");
  log_and_increment();
};
```

You can see that we will be looping through this script 3 times and outputting 3 variables: an integer, an array, and an object. Here are the results:

![](/images/medium-export/0__4Z9kHURSwJ720SdZ.jpeg)

So what is going on here? If the console can render the variable directly (integer or string), it will render the variable as it was at the time of execution. If the variable is interpreted (array or object), then it will render the variable in its current state. If you open the arrays in the console while the sleep is happening, you will notice them change over time.

You would have expected the arrays to be:

* \[1]
* \[1,2]
* \[1,2,3]

but they were all \[1,2,3]

I hope this clears up some confusion!

---

---
url: /about.md
description: >-
  About Evan Tahler — Head of Engineering at Arcade.dev, formerly Airbyte (via Grouparoo acquisition), Disney, TaskRabbit, ModCloth, Airbus.
---

# About

![evan](/images/resume-4.png)

Evan Tahler is the Head of Engineering at [Arcade.dev](https://www.arcade.dev), building the foundation for secure and scalable agentic tools. Prior to Arcade, Evan was the Director of Engineering of Sync Foundations at Airbyte, where he built and led the teams that focused on high-volume data movement and AI pipelines. Evan was the CTO and co-founder of Grouparoo, the open-source reverse-ETL company, which was acquired by Airbyte.

Evan's expertise lies in building the technical side of digital products, and growing the teams required to do so. He's helped companies like [Airbyte](https://airbyte.com), [Disney](https://www.disney.com), [TaskRabbit](https://www.taskrabbit.com), [ModCloth](https://www.modcloth.com), and [Airbus](https://www.airbus.com) launch new global digital initiatives, and has co-founded 3 startups. He is named on multiple patents focusing on authentication and digital entertainment. Evan is an open-source innovator, and a frequent speaker at software development conferences focusing on AI, Product Management, Data Engineering, Node.JS, Typescript and DevOps.

Evan holds a Masters in Entertainment Technology and a BS in Mechanical Engineering from [Carnegie Mellon University](https://www.cmu.edu/homepage/creativity/2014/fall/the-greater-good.shtml).
## Companies

---

---
url: /blog/post/2023-03-01-accelerating-alpha-connectors-to-airbyte-cloud.md
description: Releasing 57 Connectors to Airbyte Cloud
---

![Connector release stages](/images/posts/2023-03-01-accelerating-alpha-connectors-to-airbyte-cloud/image.png)

This month, we added **57 new Alpha connectors to Airbyte Cloud**, including CoinMarketCap, Omnisend, S3 Glue, SFTP Bulk, and Twilio Taskrouter. This marks the start of our program to accelerate the release of every eligible alpha connector to Airbyte Cloud as quickly as possible. This makes the free [Alpha and Beta connector program](https://airbyte.com/blog/why-airbyte-made-alpha-and-beta-connectors-free) even more valuable, and lets you signal to us which connectors we should focus on promoting to higher release stages based on your usage.

In my [previous blog post](https://airbyte.com/blog/connector-release-stages), I shared how Airbyte manages our connector release stages. Alpha connectors are MVPs, and Generally Available connectors are thoroughly tested and receive our highest level of support. You only pay for syncs using Generally Available connectors.

What wasn’t mentioned in the previous post was what criteria we use to release connectors to Airbyte Cloud. First, there are some connectors that will never be released to Airbyte Cloud, as they just won’t work in a cloud environment. This mostly includes the class of connectors which look at data on your local filesystem, reading files and such. Other than that, as long as the connector is safe and easy to configure, there should be no reason that you can’t try the connector, for free. The criteria for releasing a connector on Airbyte Cloud are:

1. The connector is “cloud appropriate” as described above
2. The connector is tested with a sandbox account and passing the Connector Acceptance Test suite
3. The connector configuration properly denotes secrets within its configuration specification
4. The connector provides its metadata (Icon and Documentation)
5. The connector only communicates via encrypted traffic (e.g. HTTPS or SSL)

To streamline the process for deploying community-contributed connectors to Airbyte Cloud, we’ve created tools which run every night to test connectors that aren’t yet on Airbyte Cloud and automatically create and merge PRs adding them. This should shrink the time between a new connector contribution on our open source version and when you can start syncing on Airbyte Cloud to less than 24 hours.

The complete and up-to-date list of connectors now available on Airbyte Cloud can be found [here](https://docs.airbyte.com/integrations).

On behalf of the team, thanks for using Airbyte, and keep those [connectors coming](https://airbyte.com/contributor-program)!

---

---
url: /blog/post/2015-10-15-actionhero-bootstrap-angular.md
description: >-
  We’ve had a simple ActionHero demo site for a while now, which shows how easy it is to get started with the framework, but until now, we’ve…
---

![](/images/medium-export/1__ZkuR4ORDLsAn7UzhbEDA1g.png)

We’ve had a simple [ActionHero demo site](http://demo.actionherojs.com) for a while now, which shows how easy it is to get started with the framework, but until now, we’ve not had a "real world app" example… and it is time that we did! This project shows off how to use a number of more ‘advanced’ features, like session persistence and sharing between HTTP and Web Sockets on the same page, authenticating your actions, etc.
* This example project can be viewed live at [angular.actionherojs.com](http://angular.actionherojs.com).
* The code is on Github @ .
* Visit [www.actionherojs.com](http://www.actionherojs.com) for more information

[**evantahler/actionhero-angular-bootstrap-cors-csrf**](https://github.com/evantahler/actionhero-angular-bootstrap-cors-csrf)

---

---
url: /blog/post/2016-12-14-actionhero-and-standardjs.md
description: >-
  I don’t want to talk about ESLint ever again. I don’t ever want to talk about "code style" ever again. I want to write code, and I want the…
---

*I don’t want to talk about ESLint ever again. I don’t ever want to talk about "code style" ever again. I want to write code, and I want the machine to ensure my "style" is correct. I don’t have time for this… I’ve got an app to write!*

![](/images/medium-export/1__QEsaQMJW6Ysa9HYNlV9XiA.png)

[StandardJS](http://standardjs.com/) is a ridiculously opinionated collection of "very good" javascript (and JSX) code style rules. There is no configuration and you can’t change the rules. You are either all in or entirely out. **This is perfect**. I want a set of best practices to follow, set by industry "experts", which are automatically applied for me.

[ActionHero](http://www.actionherojs.com) has been using a custom collection of `.eslint` rules for a long time now… but they were very *bespoke*. Github user [@synthmeat](https://github.com/evantahler/actionhero/commits/cf69eb692811cf66d94348d7026c916a7d2c8293/.eslintrc) did the hard work of creating our `.eslintrc` configuration back in the dark days of May 2016, and before that, we had `.jshint`. Having a linter is great, and it **really did** help us catch bugs and have a consistent style. However, our consistent style was only based on one thing… my weird opinions. Synthmeat created a style guide to match what ActionHero already had. Over time, those opinions drifted further and further away from any known "best practices", and our `.eslint` file became more and more esoteric. This wasn’t great for on-boarding new users and growing the community.

We recently refactored [ActionHero](http://www.actionherojs.com)’s internal code to use more ES6 features (arrow functions, `let` vs `const`, etc.), and it was the perfect time for a refactor.

**Awesome Standard.JS feature #1**: `standard --fix`

This command actually modifies your JS files automatically and coerces them into a compliant format. I had budgeted 10 hours for this refactor… this brought it down to 2. You probably shouldn’t let a piece of code magically change your source without a good test suite to ensure that things still work… which we have!

**Awesome Standard.JS feature #2**: `standard`

The standard package comes with a binary which tests your code. You don’t need to worry about including `eslint` (it is included automatically), `.eslint` files (they handle it), or anything else. This one binary just works. It works so well, you can set it as the `npm pretest` command and automatically enforce code quality as part of your test suite.

**Awesome Standard.JS feature #3**: Integrations

There are wonderful plugins for [Sublime](https://packagecontrol.io/packages/SublimeLinter-contrib-standard), [Atom](https://atom.io/packages/linter-js-standard), [VSCode](https://github.com/shinnn/vscode-standard), [VIM](https://github.com/maxogden/standard-format), and [more](http://standardjs.com/index.html#text-editor-plugins) which make testing your code inline a breeze.
![](/images/medium-export/1__Zagfg25g7Z3qYV074TBqqw.gif)

Thank you StandardJS!

---

---
url: /blog/post/2016-04-24-actionhero-and-tessel-2.md
description: I just received my Tessel 2 in the mail.
---

I just received my [Tessel 2](http://tessel.io) in the mail.

![](/images/medium-export/1__j5aWtCqFUgXrKSRbG3YuAQ.jpeg)

For those of you who don’t know, the Tessel is a small, low-power computer which is capable of running [node.js](http://nodejs.org) natively. This little micro-computer has all the bells and whistles of a first-class IoT device, including wifi, pins, UART support, etc…. BUT IT SPEAKS A HIGH-LEVEL LANGUAGE I ACTUALLY WANT TO USE! Needless to say, I’m pretty excited. After spending a [fair](https://medium.com/bricolage-evan-s-blog/node-js-running-on-a-phidgets-sbc2-board-5b188a6123af#.4ph1grtkp) [amount](https://medium.com/bricolage-evan-s-blog/pivotal-tracker-phidgets-and-nerf-guns-6b196a4254a0#.maf0ys2ww) of [time](https://medium.com/bricolage-evan-s-blog/on-nodejs-and-phidgets-282a765aea7b#.fbkim1o7l) getting node.js to run on a Phidget board, the Tessel seems like the next iteration of an IoT device that might actually be accessible.

After going through the tutorials and getting the [LED lights on the board to blink](http://tessel.github.io/t2-start/blinky.html), I wanted to move on to getting [ActionHero](http://www.actionherojs.com) running on the board so I could have background tasks and web-sockets. There were a few changes required to get this running:

#### *Getting Started*

First, connect to your Tessel 2 and set it up!

```bash
# Install the Tessel command package from NPM
npm install -g t2-cli

# Find and Rename your Tessel
t2 list
t2 rename TesselBot

# Connect the Tessel to your WiFi
t2 wifi -n -p

# Allow your computer to talk to the Tessel
t2 provision
```

Create a new ActionHero project just like you would for any use case:

```bash
mkdir actionhero
cd actionhero
npm install actionhero
./node_modules/.bin/actionhero generate
```

#### Working Directory

The first issue I encountered was around the Tessel 2’s execution path. ActionHero makes a lot of assumptions about how you start the process, so it can load up your local config files and actions, overriding what is in the core project. When the Tessel runs your code, it runs it from **/root**, but your code resides in **/tmp/remote-script**. To solve this, we can create a simple wrapper script for your project.

Create an **index.js** at the root of your project, and require ActionHero directly. Then, we can check if the **\_\_dirname** of this file is different than **process.cwd()**, and if it is, we can change it. From there, we can then boot the ActionHero server.

```js
var ActionheroPrototype = require("actionhero").actionheroPrototype;
var actionhero = new ActionheroPrototype();

process.env.PORT = 80;

console.log("Starting up Tessel ActionHero Wrapper");
console.log("  local path: " + __dirname);
console.log("  working path: " + process.cwd());

if (process.cwd() !== __dirname) {
  console.log("  changing working path to : " + __dirname);
  process.chdir(__dirname);
}

actionhero.start(function (err, api) {
  if (err) {
    throw err;
  } else {
    api.log("~ boot shim complete ~");
  }
});
```

When this runs, you can see the change in the output:

```bash
> t2 run --full index.js
INFO Looking for your Tessel…
INFO Connected to tesselBot.
INFO Building project.
INFO Writing project to RAM on tesselBot (19278.336 kB)…
INFO Deployed.
INFO Running index.js…
Starting up Tessel ActionHero Wrapper
  local path: /tmp/remote-script
  working path: /root
  changing working path to : /tmp/remote-script
2016-04-24T18:50:54.228Z - notice: *** starting actionhero ***
2016-04-24T18:50:59.036Z - warning: running with fakeredis
2016-04-24T18:51:01.108Z - info: actionhero member 10.0.1.40 has joined the cluster
2016-04-24T18:51:05.405Z - notice: pid: 1491
2016-04-24T18:51:05.428Z - notice: server ID: 10.0.1.40
2016-04-24T18:51:11.432Z - notice: Starting server: `web` @ 0.0.0.0:8080
2016-04-24T18:51:22.957Z - notice: environment: development
2016-04-24T18:51:23.198Z - notice: *** Server Started ***
2016-04-24T18:51:23.203Z - info: ~ boot shim complete ~
```

A nice thing about this boot file is that you can set some Tessel-specific overrides here that won’t affect your local development when you run the normal **npm start**. For example, I want the web server, when running on the Tessel, to run on port 80, and I want to set the **NODE\_ENV**. To do that, I added the following:

```js
process.env.PORT = 80;
process.env.NODE_ENV = "production";
```

… which has the effect of overriding the default in the config, just like if I had launched the server with that ENV flag.

#### Web-Sockets

I’ve noticed that pretty much all the websocket servers add significant boot time to the process, adding around ~10 minutes. The app does boot, but it’s painfully slow. I actually think this has to do with the minification of the JS file onto disk, and not the server itself. I don’t need web-sockets for my project, so I disable the server… this makes things much faster.

#### Run it

```bash
# To try out your project while connected to the Tessel 2
t2 run --full index.js

# To install your project to the Tessel 2 so it runs at boot
t2 push --full index.js
```

You can reach your Tessel in a browser via **\.local**

#### Observations

* While running ActionHero on a Tessel 2 is far slower than on a \*real\* computer, booting up in under a minute is a real impressive feat for such a tiny board. That’s quick enough to consider using it in the field.
* File I/O is slow. If you are planning on serving/saving assets to/from disk, expect a fairly slow response time. I’m going to guess that this is one of the major contributors to the slow boot time.
* Because file I/O is so slow, be sure you disable [development mode](http://www.actionherojs.com/docs/#development-mode) (possibly by setting the NODE\_ENV as pointed out above). It constantly polls the file system…. and you can’t really edit files on the Tessel itself anyway.

#### What’s Next?

I’ve already got my Tessel 2 tweeting a photo every minute (thanks, [ActionHero tasks](http://www.actionherojs.com/docs/#tasks) and a very simple [A/V API](https://github.com/tessel/tessel-av))… what can you come up with?

I previously noted that the Tessel could not support binary modules like `ws`, but it can!

---

---
url: /blog/post/2013-12-04-actionhero-v7.md
description: A quick post to say that I’m very happy about where ActionHero is going.
---

A quick post to say that I’m very happy about where [ActionHero](http://actionherojs.com) is going. The goals of the 7.x.x release have almost all been met! The target for the 8.x.x release is to finally build out a modular plug-in system for common "packs" of actions and initializers. This includes authentication, database connections, etc.
To make the resque move, we created the [node-resque](https://github.com/taskrabbit/node-resque) package, which other folks at [TaskRabbit](http://taskrabbit.com) have been helping with.

[**taskrabbit/node-resque**](https://github.com/taskrabbit/node-resque)

On the community side, we are now getting over 1K downloads a month from NPM! We also gained a number of [committers](https://github.com/evantahler/actionhero/network/members) to actionHero core over the last month. I’m considering doing an actionHero meet up (perhaps physical or virtual) in the near future. Any recommendations?

---

---
url: /blog/post/2014-03-10-actionhero-and-newrelic.md
description: >-
  There is now a plugin for ActionHero + New Relic if you are using actionhero version 10 or later.
---

*There is now a plugin for ActionHero + New Relic if you are using actionhero version 10 or later.*

[**evantahler/ah-newrelic-plugin**](https://github.com/evantahler/ah-newrelic-plugin)

As [ActionHero](http://www.actionherojs.com) matures, integrating with production monitoring tools becomes a required feature. I recently discussed integrating with [airbrake](http://blog.evantahler.com/blog/airbrake_and_actionhero.html), and now it’s time to talk [newrelic](http://newrelic.com). For those of you that don’t know, NewRelic is a great tool to monitor the guts of your application in production, including tracing request durations, errors, etc. We use it at [work](http://www.taskrabbit.com).

[NewRelic’s JS package](http://newrelic.com/nodejs) was really easy to work with, and all you need is the following initializer:

1. install the newrelic agent by adding `newrelic` to `package.json`
2. copy over the newrelic config file to the root of your app with `cp node_modules/newrelic/newrelic.js ./newrelic.js`

```js
newrelic = require("newrelic");

exports.newrelic = function (api, next) {
  api.newrelic = {};

  api.newrelic.middleware = function (connection, actionTemplate, next) {
    if (connection.type === "web") {
      // for now, the node newrelic agent only supports HTTP requests
      newrelic.setTransactionName(actionTemplate.name);
    }
    next(connection, true);
  };

  api.newrelic.errorReporter = function (type, err, extraMessages, severity) {
    newrelic.noticeError(err);
  };

  api.newrelic._start = function (api, next) {
    // load the newrelic middleware into actionhero
    api.actions.preProcessors.push(api.newrelic.middleware);
    // load the newrelic error reporter into actionhero
    api.exceptionHandlers.reporters.push(api.newrelic.errorReporter);
    // optional: ignore certain actions
    // newrelic.setIgnoreTransaction('actionName');
    next();
  };

  api.newrelic._stop = function (api, next) {
    next();
  };

  next();
};
```

I was kicked into action by [this issue on the newrelic JS project](https://github.com/newrelic/node-newrelic/issues/121). In your newrelic.js config file, there are a number of options you can configure, such as moving the log location, etc. [Here is the complete list of options](https://docs.newrelic.com/docs/nodejs/customizing-your-nodejs-config-file).

*Originally published at 10 Mar 2014*

---

---
url: /blog/post/2016-10-26-actionhero-nodejs-v7.md
description: Does ActionHero work with the newly released Node.js v7?
---

Does [ActionHero](http://www.actionherojs.com/) work with the newly released [Node.js](https://medium.com/u/96cd9a1fb56) v7?

![](/images/medium-export/1__1aBZOO8chVHmePbMxUTl0w.png)

Of course it does! Also, ActionHero just got a new documentation website.
[Check it out!](http://www.actionherojs.com/)

![](/images/medium-export/1__LQu6nx3EOUYtUj__LnrprUg.png)

---

---
url: /blog/post/2016-05-16-actionhero-community-on-slack.md
description: This week the Actionhero Community moved from Gitter to Slack
---

This week the Actionhero Community moved from [Gitter](https://gitter.im) to [Slack](https://slack.com/).

![](/images/medium-export/1__myDepyzyDisjRxKGEbj6tA.png)

The move was voted on by the community, and Slack won 4:1. Both tools are free for open source projects, both have web, mobile, and desktop clients, and offer deep integration with the various programming tools we use (GitHub, Travis-CI, etc). For me personally, I had a hard time choosing one over the other. I’ve been in contact with the Gitter team for some time, and they have been wonderful to work with (and I \*think\* they might use parts of Actionhero to power Gitter). I wanted to share our reasons for leaving on our first day off of their platform:

> re:
>
> Hello Mike-
>
> You asked me to reach out and explain why the Actionhero community is moving from Gitter to Slack.
>
> Personally, it was a toss-up. I’ve been a happy member of a number of Slack channels/orgs for work, and I’ve also enjoyed my time with Gitter (since 2014!). I’ve had a number of conversations with your team, and you’ve all been very nice and helpful… and made a wonderful (free!) product.
>
> A few folks have been asking to move to Slack for some time, and if you check the last few days of our chat [\*in\* Gitter](https://gitter.im/evantahler/actionhero), I’ve pressed them to explain their reasons. Without any clear feature-differentiation, I think it really came down to "I already need to use Slack for work, I like it, and I don’t want to use more apps". A few folks wanted to have sub-channels (perhaps about a PR or topic), but I don’t know if our community is big enough to use it. However, we put it to a vote, and Slack won… so we are switching.
>
> The Socket.io team has a [good paradigm](http://rauchg.com/slackin/) for how to simplify the Slack signup flow for an open source project (which does require you to host a running login service), but they make it simple via Heroku.
>
> All of this said, I believe I’ve offered in the past to help with Gitter if I can, as I think you rely on many of the same technologies as Actionhero… and that offer still stands.

After using Slack for the Actionhero community for a week, I’ve now found some \*real\* reasons why the platform is working better for the team. I wanted to share them here for other open-source projects which might be considering a similar thing.

#### Channels

* Channels for Bots: integrations can live \*in\* a channel, rather than in the main room. This is nice for cleaning up clutter, but still allowing anyone to view the status of the latest build.
* Easy-to-create feature channels to move conversations out of the main chat, while you can still link/reference specific messages back to the main room

#### Integrations

Slack simply has more integrations already. Want to add a bot that lets you give kudos to your team? Done. Easy. Want a bot to remind you to do something next week? It exists.

![](/images/medium-export/1__BKTSuf4weUBzI9kbAVi3mw.png)

Yes, of course you can connect actionhero to slack as well.

#### Open Source Signup

One of the first hurdles was handling signup and registration for the team. This is an open source project, so we want to allow anyone to join without waiting for an administrator to have to approve anything.
Luckily, the [socket.io](http://socket.io) team already handled this for us via [Slackin](http://rauchg.com/slackin/). Slackin is a little app you can run for free on Heroku that creates a simple sign-up page for your team. It takes 2 configuration options and can be up-and-running in under 5 minutes.

![](/images/medium-export/1__4OtTLNR66ejWmlnUPIKe9A.png)

Slackin also provides Gitter-like chat-badges for your team:

![](/images/medium-export/1__v9SZ7TrbRP9jnZlwh1LSxQ.png)

#### IRC and Jabber

Slack has bridges for both [IRC and Jabber](https://get.slack.help/hc/en-us/articles/201727913-Connecting-to-Slack-over-IRC-and-XMPP). Prefer to use those tools and still interact with the team on Slack? You can! Only a small number of the Actionhero team wanted this feature, and I think the utility of these tools is dying off, but it’s nice to know that they are there.

*Yes, having a wholly open-source stack for your chat is nice, but the "it-just-works" nature of both Slack and Gitter outweighs those concerns for me at this time (for an already open-source project).*

#### Most Importantly: More Chatting

In the week since we’ve moved to Slack, the number of chat messages has doubled. This is by far the most important metric. **The team is talking more.** I bet that this is a combination of 2 things:

* Folks already have Slack open at work, and they can add the Actionhero channel easily
* Slack’s notifications are stickier.

While I was skeptical at first, the community was correct: **Utility really does increase by using a tool folks already have.** There are a few posts talking about how [Slack doesn’t work for large open-source *societies*](https://medium.freecodecamp.com/so-yeah-we-tried-slack-and-we-deeply-regretted-it-391bcc714c81#.kbxojwyny)*,* but for now, at Actionhero’s current scale, I think we have found our home.

---

---
url: /blog/post/2019-09-30-actionhero-for-real-time-games.md
description: Introducing the Actionhero Illustrated Community Q&A!
---

### Introducing the Actionhero Illustrated Community Q\&A!

![](/images/medium-export/1__fpDBDrYAMXpsSd7Ooykw6A.png)

Welcome to the first edition of the `Actionhero Illustrated Community Q&A`! The `Actionhero Illustrated Community Q&A` is a new project whose goal is to capture some of the best questions and answers from the Actionhero Slack group and share them with the world.

For those of you who don't know, Actionhero is a [Node.js](https://medium.com/u/96cd9a1fb56) server framework focused on multi-transport reusability, real-time chat and gaming, background jobs, etc. Actionhero includes many of the components mature digital products need out of the box so you don't have to reinvent them yourself. You can learn more about Actionhero at [www.actionherojs.com](http://www.actionherojs.com) or in [Why Actionhero is the Node.js server for when your project grows up](https://blog.evantahler.com/why-choose-actionhero-9a4b5caf4e62).

Actionhero has a vibrant slack community, ***which is free for anyone to join at*** [***slack.actionherojs.com***](https://slack.actionherojs.com/).

![](/images/medium-export/1__HHq4A__bmqHmIcdK4zWkBgA.png)

It is a great place to ask questions and learn from other Actionhero developers, and we have a wide range of active members, from those who are just getting started to those that have been with the project for years. That said, Slack is not a great repository for knowledge… it can be hard to search, and even harder to know which information is stale. Also… very few people ever draw illustrations or wireframes!
I’m of the opinion that even a bad drawing can go a long way to explaining a technical concept. I’m a terrible illustrator, but I do know how to use Omnigraffle and tools like it! My challenge is to use the simplest tools to annotate these conversations.

Every week for the month of October, I’ll be posting another article in this series which is lifted directly from our Slack community. I’m making the bet that interesting or popular conversations in Slack represent the larger community of Actionhero developers.

Enjoy!

***

### Actionhero for Real-time Games

#### September 30th, 2019

[Source Conversation in Slack](https://actionherojs.slack.com/archives/C04EVSUSD/p1569850773078800)

Today’s conversation is the one that sparked this whole series! Adrian Lukas Stein ([@AdrianLStein](https://twitter.com/AdrianLStein) on Twitter), new to the Actionhero Community, asks:

> Hey guys, I hope this is the right channel for this :sweat\_smile: I’ve just stumbled upon actionherojs and I’m a bit confused whether it is possible to create a game with it that has real-time movement with multiple players. I’ve read the documentation and found the Actions I know from Nakama for example and of course the chat. I’m just building a prototype right now for like 50 players. My question is: What is the best way to transmit the realtime data, through the actions or would a repurposing of the chat give a better result performance-wise?

Chad Robinson ([@codeandbiscuits](https://twitter.com/codeandbiscuits) on Twitter), a core contributor to Actionhero, gave a great and nuanced response:

> I would personally try to leverage chat (or write a similar bespoke layer in the same vein). Mostly because if you actually want real-time updates you don’t want to be making REST calls, you almost certainly want WebSockets so the clients are all maintaining active connections to the server(s), and the server(s) can send messages to them any time, whether the client requests a response or not. (To broadcast your updates.)

![](/images/medium-export/1__6t6nYRSZIwZ__EjbUDLp7zw.png)

> ActionHero’s chat mechanism isn’t really doing much that’s special — its most "special" characteristic is simply that it exists and you don’t have to build the same thing. A lot of sample chat apps and tutorials you’ll find on the Internet are built around a single server. They make no arrangements for scalability. ActionHero ships out of the box leveraging Redis pub/sub to broadcast messages across all nodes in your cluster. During development you don’t notice this — it just works, and that’s great. But when you go to scale you’ll be buying yourself a pizza for choosing this path because it will also "just work" at a stressful moment when everyone else is scratching their head on how to scale their apps.
>
> Behind the scenes, all the "chat" service is doing is subscribing to a Redis pub/sub topic. When you send a message, it gets published to that topic, and listeners get a copy. Chat also provides some "room join" mechanics so you can break these messages out into groups of interested people. So for example in your classic minecraft/HALO type scenario where you had 2–4 people playing a world together, you might make a "room" for them and join them to it. Anything sent to that room gets broadcast to the others — position updates, interactions with the world, etc.

![](/images/medium-export/1__i8s__PNZA81__ELzISwEoCJQ.png)

> One thing to keep in mind is that JS is very slow by comparison to the latencies expected by a lot of games.
> It’ll work great for turn-based games or ones where interactions can be a little slow (puzzle solvers). But you’re going to hit barriers if you try to make an FPS with this. It’s not AH’s fault, it’s just the nature of all the work going on doing things like encoding and decoding JSON, context switching while resolving Promises/async calls, etc.
>
> And finally, AH’s "endpoints" are modular. If you \[are] tired of JS decoding and Websocket overhead you could activate the TCP raw protocol server, or add a protobuf one. Very few frameworks allow you to do this so it’s a big differentiator.

Chad has built a number of [high-throughput applications](https://www.medialantern.com/) with Actionhero. He’s done a great job of explaining the high-level architecture of how you might use Actionhero to create a real-time game, and also explained some of the limitations of the framework — a balanced response.

If you are building a realtime game, perhaps [Actionhero](http://www.actionherojs.com) is a good choice for your backend!

---

---
url: /blog/post/2013-07-05-actionhero-tutorials.md
description: >-
  I wrote a few tutorials for actionHero recently at the behest of the community:
---

I wrote a few tutorials for actionHero recently at the behest of the community:

### ActionHero Tutorial

![](/images/medium-export/0__0SA9sIOJ5ezVC671.jpg)

This is a project on github which shows off some of the features of actionHero, and how to make use of them. It includes an extensive step-by-step README, and functional sample code.

**Check it out here:** [**https://github.com/evantahler/actionHero-tutorial**](https://github.com/evantahler/actionHero-tutorial)

[**evantahler/actionhero-tutorial**: An example actionhero project demonstrating many common features](https://github.com/evantahler/actionHero-tutorial)

### Tic-Tac-Toe Example

![](/images/medium-export/0__mPuuYXX8pTWXnSmr.jpg)

actionHero was created to be a game server, although it has morphed into more than that today. To honor actionHero’s roots, this example shows how to make a TCP API server to play tic-tac-toe against an unbeatable API.

**Check it out here:**

*Originally published at 05 Jul 2013*

---

---
url: /blog/post/2016-06-24-actionhero-v14-and-a-problem.md
description: Actionhero is now at version 14.0.1!
---

Actionhero is now at version 14.0.1!

![](/images/medium-export/1__xINPKaOBDM40qfeSgBXyVQ.png)

The v14.0.0 release includes a new way to save and reuse formatters and validators for your actions, and also gives you greater control of your Redis connections. The v14.0.1 release fixes a bad initialization of the above… You can see the whole changelog here:

### Named Validators & Formatters

This allows action validators and formatters to use both named methods and directly-defined functions.

```js
exports.cacheTest = {
  name: 'cacheTest',
  description: 'I will test the internal cache functions of the API',
  outputExample: {},

  inputs: {
    key: {
      required: true,
      formatter: [
        function(s){ return String(s); },
        'api.formatter.uniqueKeyName' // <----------- HERE
      ]
    },
    value: {
      required: true,
      formatter: function(s){ return String(s); },
      validator: function(s){
        if(s.length < 3){ return '`value` should be at least 3 letters long'; }
        else{ return true; }
      }
    },
  },

  run: function(api, data, next){
    // ...
  }
};
```

And then you would define an initializer with your formatter:

```js
"use strict";

module.exports = {
  initialize: function (api, next) {
    api.formatter = {
      uniqueKeyName: function (key) {
        return key + "-" + this.connection.id;
      },
    };
    next();
  },
};
```

### Redis Client

There are **so many** ways to configure redis these days… handling the config options for all of them (sentinel? cluster?) is a pain… so let’s just let the users configure things directly. It will be so much simpler!

### This will be a breaking change

* In config/redis.js, you now define the 3 redis connections you need explicitly, rather than passing config options around:

```js
var host = process.env.REDIS_HOST || "127.0.0.1";
var port = process.env.REDIS_PORT || 6379;
var database = process.env.REDIS_DB || 0;

exports["default"] = {
  redis: function (api) {
    var Redis = require("ioredis");

    return {
      _toExpand: false,
      // create the redis clients
      client: Redis.createClient(port, host),
      subscriber: Redis.createClient(port, host),
      tasks: Redis.createClient(port, host),
    };
  },
};
```

* Move `api.config.redis.channel` to `api.config.general.channel`
* Move `api.config.redis.rpcTimeout` to `api.config.general.rpcTimeout`
* Throughout the code, use `api.config.redis.client` rather than `api.redis.client`

Quickly after releasing version 14.0.0, we realized that there was a problem with the new way we handled the redis config. Actionhero loads its configuration [recursively](https://github.com/evantahler/actionhero/blob/master/initializers/config.js#L165-L174). We do this so that you can reference config directives from one file inside another. If you attempt to reference something that isn’t yet defined, we’ll skip over the file in question, load the rest of the config, and try again. Under the hood, that means that any individual file is potentially required and exported many times. This is fine when you are building up a hash object, but terrible if you are creating a new connection to redis on each run. A new actionhero project ended up creating 27 connections to redis.

The good news is that the redis configuration is now in user-space. It was a simple change to check if the redis connections exist already, and if they do, disconnect the old ones. This isn’t yet an ideal solution (as booting up will connect and disconnect a number of times), but it’s an improvement.
```js
var host = process.env.REDIS_HOST || "127.0.0.1";
var port = process.env.REDIS_PORT || 6379;
var database = process.env.REDIS_DB || 0;
var password = process.env.REDIS_PASS || null;

exports["default"] = {
  redis: function (api) {
    var Redis;
    var client;
    var subscriber;
    var tasks;

    // cleanup if we are rebooting or looping in config load
    if (api.config.redis) {
      if (api.config.redis.client) {
        api.config.redis.client.quit();
      }
      if (api.config.redis.subscriber) {
        api.config.redis.subscriber.quit();
      }
      if (api.config.redis.tasks) {
        api.config.redis.tasks.quit();
      }
    }

    if (
      process.env.FAKEREDIS === "false" ||
      process.env.REDIS_HOST !== undefined
    ) {
      Redis = require("ioredis");
      client = new Redis({ port: port, host: host, password: password, db: database });
      subscriber = new Redis({ port: port, host: host, password: password, db: database });
      tasks = new Redis({ port: port, host: host, password: password, db: database });
    } else {
      Redis = require("fakeredis");
      client = Redis.createClient(port, host, { fast: true });
      subscriber = Redis.createClient(port, host, { fast: true });
      tasks = Redis.createClient(port, host, { fast: true });
    }

    return {
      _toExpand: false,
      // create the redis clients
      client: client,
      subscriber: subscriber,
      tasks: tasks,
    };
  },
};
```

I’ll keep working on this…

---

---
url: /blog/post/2016-08-07-actionhero-v15.md
description: >-
  A quick note to say that ActionHero has reached version 15. You can read the full release notes here.
---

### ActionHero V15.0.0

A quick note to say that [ActionHero](http://www.actionherojs.com) has reached version 15. [You can read the full release notes here.](https://github.com/evantahler/actionhero/releases/tag/v15.0.0)

We now have a robust task middleware system (including middleware that can modify task/resque enqueuing) and more sensible binary commands.

![](/images/medium-export/1__ferA1ehHF40hn5EDKPGiXw.png)

I’m **very happy** to say that the majority of the work on this release has been done by community members, specifically GitHub users [l0oky](https://github.com/l0oky) and [gcoonrod](https://github.com/gcoonrod).

### ActionHero Upgrade Guide

During this release we talked a lot about the release cadence of ActionHero. Now that we have a fair number of enterprises using ActionHero, we need to be more mindful about handling the upgrade process, and making it as painless as possible. We’ve been following [Semantic Versioning](http://semver.org) for a while now (since before v9.0.0), ensuring that any breaking changes result in a new major version increase. However… when there is that breaking version change, **how** do you upgrade your project? To that end, [we’ve launched a new section of the ActionHero documentation which includes release guides for each major version.](http://www.actionherojs.com/docs/#upgrading-actionhero)

![](/images/medium-export/1__0QCxQAd2jc1P8__yHWNWIUQ.png)

### ActionHero Release Cadence

We’ve discussed whether or not we should release major versions in many small batches (leading to more, but smaller, breaking-change releases) or if we should group collections of breaking changes into one release (creating bigger releases less often). Operating under the assumption that work will continue at the same rate, and we’ll be \*creating\* the same number of breaking changes, we’ve opted to stick with "smaller breaking releases more often". The main reason for the decision was that folks don’t want to have to wait for a new feature to land in the master branch and be released.
ActionHero’s release strategies are now defined: * Follow semver (any breaking change means a major version change) * No more than one major release per month. * Do not slow down progress. Release Often (even if it breaks things). Yes, #2 and #3 contradict on purpose! Finding the right balance is a job for the community. I do expect this cadence to slow down as the product continues to mature, until we release ~1 breaking release every 6 months. I expect us to reach that point in about a year. One thing the ActionHero community needs is a Backporter. Since we have agreed to release often, if there is a bugfix to the master branch, it is very likely that it won’t be easily translatable to an older branch. We are actively looking for a maintainer to own a major version that they are actively using. This will allow us to have a sort of "LTS" model for some older releases. [Please join us on our Slack channel if this interests you](http://slack.actionherojs.com). --- --- url: /blog/post/2016-10-28-actionhero-v5-1-2-security-release.md description: Today we released the first-ever security release for ActionHero. --- Today we released the [first-ever security release for ActionHero.](https://github.com/evantahler/actionhero/releases/tag/v15.1.2) Details can be found below: ![](/images/medium-export/1__HQAn3z05N__iwWd__Q__JEFlA.png) ### 404 Web Request with malicious file name Previously, the default error responder when a client asked for a static-file which was missing (404) returned the name of that file: ```js api.config.errors.fileNotFound = function (connection) { return connection.localize([ "That file is not found (%s)", connection.params.file, ]); }; ``` This is dangerous because a malicious actor could request a filename with an executable javascript tag and harm the requester. We now will no longer return the file name: ```js api.config.errors.fileNotFound = function (connection) { return connection.localize(["That file is not found"]); }; ``` ### Malicious callback provided when requesting an action via JSONp When requesting an action via JSONp, it was possible (though unlikely) that the `callback` string you were providing contained malicious javascript which would harm the requester. We will now sanitize the provided `callback` in the following way: ```js function callbackHtmlEscape(str) { return str .replace(/&/g, "&amp;") .replace(/"/g, "&quot;") .replace(/'/g, "&#39;") .replace(/</g, "&lt;") .replace(/>/g, "&gt;") .replace(/\)/g, "") .replace(/\(/g, ""); } ``` This fix has been back-ported to: * ActionHero V14 @ [v14.0.12](https://github.com/evantahler/actionhero/releases/tag/v14.0.12) * ActionHero v13 @ [v13.4.5](https://github.com/evantahler/actionhero/releases/tag/v13.4.5) A huge thank you to [@submitteddenied](https://github.com/submitteddenied) for reporting these issues and working to fix them.
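To make that concrete, here is a hedged sketch of what a class-based, `async/await` action can look like (the action name, cache key, and response field below are invented for illustration and are not taken from the release itself):

```js
// Hypothetical example of a v18-style action written with async/await.
// The action name, cache key, and response shape are invented for illustration.
const { Action, api } = require("actionhero");

module.exports = class CacheExample extends Action {
  constructor() {
    super();
    this.name = "cacheExample";
    this.description = "save a value to api.cache and read it back";
    this.outputExample = {};
  }

  async run(data) {
    await api.cache.save("myKey", { hello: "world" });
    // the exact return shape of cache.load can vary by version;
    // the stored value is available on the resolved object
    const cached = await api.cache.load("myKey");
    data.response.cached = cached.value;
  }
};
```

Anything thrown inside `run` can be handled with a plain `try/catch`, which is a large part of the "less code, simpler error handling" claim above.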
With the newer versions of node, we also get access to real `class` methods, which make extending and sharing code much easier. If you are not familiar with `async/await` in javascript, there are a number of [excellent guides](https://hackernoon.com/6-reasons-why-javascripts-async-await-blows-promises-away-tutorial-c7ec10518dd9) out there (the linked one is my favorite). There are no more callbacks and no more promise chains. You use `try/catch` to deal with errors. You can use normal `for` and `while` loops to work on async methods. The world is so much more pleasant! Code is more readable, and bugs are far easier to find and test. ![](/images/medium-export/1__5FB5y2ZoQkn3TDrvz98zFg.jpeg) For example, the way an action testing `api.cache` is written changed dramatically from ActionHero v17 to v18. This is going to be so much better! ### Community Support and Breaking Changes *Because this release had so many breaking changes, we needed to handle this release differently than we have in the past.* The ActionHero Core Team had to make a hard decision with this release. This marks the first version we’ve released that ***does not*** **work with all active LTS versions of Node.js**. Until now, this had been our policy. However, we felt the gains in legibility, productivity, and debugging were so important that leaving ‘legacy’ users behind was the correct tradeoff. However, to continue to support ActionHero users on v17, we will break with our other policy of only supporting the "master" branch. We’ve cut a `v17` branch, and will continue to accept patches and updates to it until March of 2018. We will also port any security fixes from master back to v17. [Greg](https://github.com/gcoonrod) has also offered to create inline documentation for v17 as well (more on this later). We know that upgrading to v18 (and perhaps a new version of Node.js) will be the most difficult ActionHero migration to date, but I assure you it will be worth it! I’ve also discussed these thoughts on the first [Always bet on Node podcast](https://twitter.com/dshaw/status/909565638443708417) with [Dan Shaw](https://medium.com/u/7c861ae496fa) and [Mikeal](https://medium.com/u/70d203660866). ### What’s New The [**full Release Notes**](https://github.com/actionhero/actionhero/releases/tag/v18.0.0) for ActionHero v18 can be found on GitHub, and as usual, we’ve published our [**Upgrade Guide from the previous version**](https://docs.actionherojs.com/tutorial-upgrade-path.html). The new features I’m most proud of are: **Full Support for** `**async/await**` **programming in ActionHero.** Really. This is a big deal. Our test suite saved hundreds of lines of code. There is no more ‘callback hell’, and in general, everything is so much easier to understand. We don’t need any more [flow control](https://github.com/caolan/async) tools. We’ve also significantly increased our test coverage to include [plugins](https://docs.actionherojs.com/tutorial-plugins.html) and the [CLI commands](https://docs.actionherojs.com/tutorial-cli.html)… parts of ActionHero which, in the old callback style, were \*very\* hard to test. **Full documentation of all public classes and APIs.** I had not heard of [JSdoc](http://usejsdoc.org/) before this upgrade… But, just like when I learned about [standard.js](https://blog.evantahler.com/actionhero-standard-js-1be0e6b0f1d4), I now can’t imagine creating publicly-consumable projects without it.
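If you haven’t seen JSdoc before, here is a minimal sketch of what an annotated method looks like (a hypothetical helper written for illustration, not copied from the ActionHero source):

```js
/**
 * Add a connection to a chat room.
 * @param {string} connectionId - The id of the connection to add.
 * @param {string} room - The name of the room to join.
 * @returns {Promise<boolean>} Resolves true when the member has been added.
 */
async function addMember(connectionId, room) {
  // ... implementation omitted ...
  return true;
}
```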
JSdoc allows you to comment your code (automatically in many cases) such that you can generate human-readable documentation from your source files. This means that as you write a new method or class, you can use the handy plugin for your code editor to [automagically generate documentation](https://atom.io/packages/jsdoc). We’ve set up ActionHero to automatically build **our new documentation site**, [**docs.actionherojs.com**](https://docs.actionherojs.com/) at the conclusion of every successful test run. This ensures that our documentation will always be up-to-date now! A bonus feature of this work is that now you will always have the ability to generate the documentation for your local version of ActionHero. `cd ./node_modules/actionhero && npm run docs`. In the coming weeks, we’ll add support for multiple versions of ActionHero to the documentation website, but having an offline version will ensure that you can get work done with ActionHero. **Requiring the** `**api**` **object.** We were able to simplify much of the ActionHero developer experience by stopping our convention of passing the `api` object into every `run`, `start`, and `stop` method. We’ve taken advantage of Node.js’ module require cache to simply allow you to `const {api} = require('actionhero')` wherever you need it. This now allows you to have access to the ActionHero api in **any file** in your project, not just actions and tasks. Try out ActionHero v18 today! `npm install actionhero && npx actionhero generate` --- --- url: /blog/post/2012-08-02-actionhero-v2.md description: 'That’s Right, the rumors were true: ActionHero is now @ V3!' --- ![](/images/medium-export/1__E2OG6YATcmZ__0IfziI3m2g.jpeg) Here’s what’s new: * WebSockets. You can use them over http or https, and they are now a first-class member of the actionHero toolkit. We use the socket.io library so you can gracefully fall back to flash or ajax. * Better Configuration Tools. You can now enable/disable any of the transports you want. There was also some major refactoring to make it easy to add more transports in the future. * A sweet new logo (see above). [Thanks Ali!](http://www.alispagnola.com/) * A better ChatRoom bus which can be shared by all types of persistent clients. You can telnet in and chat with your friend who is using web sockets on their phone. * General Bug Fixing This release also marks the point where I feel the API is mature enough that I am comfortable doing some "PR" for it, and trying to get real people using it. This means that I’ll be even more attentive to bug fixes and feature requests, so send them my way! As always, check out [GitHub](https://github.com/evantahler/actionHero) for the details.
[**evantahler/actionhero** \_actionhero.js is a multi-transport nodejs API Server with integrated cluster capabilities and delayed tasks\_github.com](https://github.com/evantahler/actionHero "https://github.com/evantahler/actionHero") --- --- url: /blog/post/2013-06-06-actionhero-v4-1.md description: >- A quick post to say that a new version of actionHero is out tonight: v4.1.0 This update contains some breaking changes to the Tasks and… --- ![](/images/medium-export/1__E2OG6YATcmZ__0IfziI3m2g.jpeg) A quick post to say that a new version of [actionHero](http://actionherojs.com) is out tonight: v4.1.0 **This update contains some breaking changes to the Tasks and Stats APIs, but the fixes are minor** The goal of this update was to seriously overhaul the task system, finally write those nagging tests, and come up with a better way to debug problems as they arise. ### Tasks The task system is now prototypical, which allows both tasks and workers to be created as needed, and more importantly, destroyed when complete. The old method of using api.tasks.enqueue() often had a memory-leaking side effect. Now, with task = new api.task(), you can task.enqueue() or task.run() and allow normal action garbage collection to take care of it. This approach is also nice when using actionCluster and redis, as task objects can have their guts JSON’d to redis and rebuilt as they are accessed. Philosophically, all tasks also now have their attributes kept in a data hash, and pointers to their new unique IDs in their various global, delayed, local, and processing queues. This means that even if a worker was to crash mid-task-move, the data about that task is always kept in the data hash for recovery AND we can make use of the atomic push and pop operations queues/arrays have to ensure a task is really only worked off once. Oh, and because of this new prototypical task system, [I can finally test it :D](https://github.com/evantahler/actionHero/blob/master/test/core_tasks.js) When getting into the guts of the task system, it was clear that there were 4 possible types of task that needed to be supported: #### **run once / one server** * the normal way of doing something delayed, IE: send an email to user X in the background * these can also have a manually set runAt time, IE: send that email in 10 minutes #### **run once / all servers** * like the above, but run on **all** servers. An example might be a task to clear out static cache files on all of your web-servers or to reboot them * like the above, these can also be delayed, IE: clear the cache in 5 minutes #### **run periodically / one server** * periodic tasks are those with a frequency set above 0 in actionHero. They will ignore any special runAt’s set, and will be enqueued again once they are done. * A good example here would be system maintenance again, or tasks which then inspect another data source. For example, at [TaskRabbit](http://taskrabbit.com) we have tasks that run all the time to check if there are any rabbits to notify about a task they might be qualified to complete. Often there are not, but we check every-so-often to be sure. #### **run periodically / all servers** * the most demanding of all tasks :D ### Stats When diving into the myriad of task types, it also became clear that analyzing such a system can be done on two levels: locally and globally. I might care how many tasks have been run overall in the whole system, but I also might care only about tasks on one server (or cluster worker #2).
This led to the realization that when keeping stats, they need to be done at both of those levels. Now, the API actions for api.stats.increment will increment both a global and local count, and api.stats.get requires you to provide which count you want, the global or local one. You can also now set a stats value directly with api.stats.set, but this will only modify the local value. An example of the new api.stats.getAll method (showing off both local and global data): ```text stats: { global: { tasks:tasksRun: 1006, actions:processedActions: 1323, webServer:numberOfWebRequests: 814, fileServer:filesSent: 104, chatRooom:roomMembers:defaultRoom: 602, chatRooom:messagesSent: 1221, webSockets:numberOfRequests: 6, webSockets:numberOfActiveWebClients: 1, cache:cachedObjects: 10, cache:totalCachedObjects: 10, chatRooom:roomMembers:undefined: 6 }, local: { redis:numberOfPeers: 1, tasks:tasksRun: 1006, actions:processedActions: 1324, webServer:numberOfWebRequests: 815, fileServer:filesSent: 104, chatRooom:roomMembers:defaultRoom: 602, chatRooom:messagesSent: 1221, webSockets:numberOfRequests: 6, webSockets:numberOfActiveWebClients: 1, cache:cachedObjects: 10, cache:totalCachedObjects: 10, chatRooom:roomMembers:undefined: 6 } } ``` Note that lots of new metrics have been added :D ### API [Also of note is the new API page on the wiki](https://github.com/evantahler/actionHero/wiki/API-Methods). This should be a handy one-stop shop for all the actionHero methods you should need to develop your app. Enjoy! *Originally published at 06 Jan 2013* --- --- url: /blog/post/2013-01-13-actionhero-v4-2-2.md description: >- This was a busy week for actionHero! We are now up to version 4.2.2. Note that this release again changed some of the internal APIs. --- This was a busy week for actionHero! We are now up to version 4.2.2. Note that this release again changed some of the internal APIs. As always, [you can check out the details here](https://github.com/evantahler/actionHero/wiki/API-Methods). This week also saw pull requests from 2 new contributors. It’s a very strange feeling knowing that (at least) these 2 folks care enough about this project to read into the guts and make fixes. I’m **very happy** they are using actionHero and making it better, but it’s strange that I have no idea what they are using it for, and more so that I know nothing about them. I’ve had popular open source projects before, but I rarely ever received any pull requests. I probably should ask them about it :D ### v4.2.0 This release is another API-changing release which addresses a number of performance issues. Most notably to developers, most internal methods no longer require the api object to be passed back. This was creating a number of circular references (which are bad). This release also changes the way that actions are processed (to be closer to how tasks work, with a dedicated ‘processor’ object). There is also a check on boot for tasks which remain in the delayed state but have no reference in the delayed queue. It’s possible for a task to end up in this state if the server was shut down at the moment that task was being inspected.
From versions.md: * circular references are bad… remove all functions that require api to be passed in (mainly the API object) * change initializer names to remove (init) * object-ize connections, append connection-type prototypes in the server setups * remove connection classes from utils * remove global ‘requires’ from the API object, put them in the initializers that need them * remove the notion of ‘public’ from connection objects * server shutdown needs to clear its connections from the chatrooms * delayed tasks which are older than 1 min should be checked against the various queues to be sure they exist * fix http message request so that all pending messages are returned * general project organization ### v4.2.1 This release adds the ability to set the action.blockedConnectionTypes attribute on an action. This allows you to create actions only for "web" or "socket" clients. The default will remain that actions are available for all client types. Thanks for the pull request and suggestion. From versions.md: Allowing support to limit the connection.type for which an action is valid. Define the array of action.blockedConnectionTypes = \[‘socket’, ‘webSocket’] for example to not allow access from TCP or webSocket clients. Not defining the array will allow all client types in. ### v4.2.2 This version is a bug fix to resolve a problem that was introduced in version 2.2.0 regarding form parsing for web clients. Thanks for narrowing down the problem and submitting a pull request for it. --- --- url: /blog/post/2014-06-06-actionHero-v9-0-0.md description: V9.0.0 of ActionHero went live on 2014–06–24 --- #### V9.0.0 of ActionHero went live on 2014–06–24 ![](/images/medium-export/1__JenwKu6ssjKvGKTHmpbh1g.png) [ActionHero](http://actionherojs.com) is very close to the v9.0.0 release! I’m particularly proud of some of the mature features we have added, including a real REPL and RPC tools. Check out the impressive list of changes below. If you have any comments on the release, please [note them on the mailing list](https://groups.google.com/forum/#!topic/actionhero-js/-abze7uC944). Thanks! This release focuses on performance, the chat system, and developer tools. We have been listening to your thoughts in the mailing list and on GitHub, and hopefully this release clears up a lot of the confusion and pain points you have identified! ### Chat Re-Write In v9.0.0, the chat system has been gutted and re-written to provide your API with finer control over chat rooms. Most importantly, you can now control which rooms connections are members of *directly*. Connections can still opt to join and leave rooms on their own (assuming the authenticationPattern is met). #### No more "listening" to rooms; Clients can be in more than one room at a time * Clients can no longer "listen" or "silence" to rooms. You are either in a room or not… but now you can be present in more than one room! * This is a change to the socket and websocket servers, the base connection object, and the client-facing JS library.
* say now requires a room name, IE: say myRoom Hello World over telnet and the socket server, or client.say(room, message, callback) in the websocket client * There are updates to the browser-facing actionHeroClient.js (and .min) to reflect these changes * The Client APIs for joining and leaving rooms are simplified to roomAdd room and roomLeave room #### When you set the authentication rules for a room, all clients already in that room will be re-checked and kicked out if needed * New methods for server-side control of rooms: * api.chatRoom.add(room, callback) * api.chatRoom.destroy(room, callback) * now connections will be notified of a room closing and be removed * api.chatRoom.exists(room, callback) * api.chatRoom.setAuthenticationPattern(room, key, value, callback) * as noted above, connections already in the room will be re-checked * api.chatRoom.roomStatus(room, callback) * api.chatRoom.authorize(connection, room, callback) * test if a connection *would* be allowed to enter a room * api.chatRoom.reAuthenticate(connectionId, callback) * check all a connection’s rooms, and remove any that aren’t currently authorized * api.chatRoom.addMember(connectionId, room, callback) * you can add a member by ID to your room * api.chatRoom.removeMember(connectionId, room, callback) * you can remove a member by ID from your room ### Primus In this release, we have removed our dependency on [faye](http://faye.jcoglan.com/) in favor of [Primus](https://github.com/primus/primus). We now use Primus in the websocket transport, and have moved all backend cluster-cluster communication to raw redis Pub/Sub. The Primus project allows you to choose from many websocket backends, including ws, engine.io, socket.io, and more. A number of new options have been added to /config/servers/websocket.js to manage this. Check out the [Primus](https://github.com/primus/primus) project for more information. #### WARNING actionhero will no longer attempt to manage non-sticky client connections. This means if you have a multi-server actionhero deployment and you use long-polling in your websocket transport, you will need to ensure that your load balancer can enforce sticky connections, meaning every request from the client will hit the same actionhero node. Implementation Notes * There should be no functional changes to the Browser-facing actionheroClient.js, meaning the methods should all behave the same. However, there have been significant changes under the hood. * The Faye initializer has been removed. * in new actionhero projects, we will include the [ws](https://github.com/einaros/ws) package as the backend for Primus (so we can generate a working project), but you do not need to keep this package. * actionhero generate will no longer create the client-facing actionheroClient.js on generation. Rather, the server will re-generate these files on boot each time. This is done so you can make changes in /config/servers/websocket.js and have them included into the client JS. 3 new config options help manage the creation of these files: ```js // you can pass a FQDN here, or function to be called / window object to be inspected clientUrl: 'window.location.origin', // Directory to render client-side JS. // Path should start with "/" and will be built starting from api.config.general.paths.public clientJsPath: 'javascript/', // the name of the client-side JS file to render.
Both `.js` and `.min.js` versions will be created // do not include the file extension // set to `null` to not render the client-side JS on boot clientJsName: 'actionheroClient', ``` ### RPC To enable the new chat API above, a key feature was the ability to add connections to a room using "serverA"’s API, even though the connection in question might actually be connected to "serverB". This required the creation of a robust Remote Procedure Call (RPC) system to allow actionhero servers to communicate with each other. You can ask for an RPC to be called on all nodes in your cluster, or just on the node which holds a specific connection. You can call RPC methods with the new api.redis.doCluster method. If you provide the optional callback, you will get the first response back (or a timeout error). RPC calls are invoked with api.redis.doCluster(method, args, connectionId, callback). For example, if you wanted all nodes to log a message, you would do: api.redis.doCluster(‘api.log’, \["hello from " + api.id]); If you wanted the node which holds connection abc123 to change their authorized status (perhaps because your room authentication relies on this), you would do: ```js api.connections.apply("abc123", "set", ["auth", true], function (err) { // do stuff }); ``` Two new options have been added to the config/redis.js config file to support this: ```text // Which channel to use on redis pub/sub for RPC communication channel: 'actionhero', // How long to wait for an RPC call before considering it a failure rpcTimeout: 5000, ``` #### WARNING RPC calls are authenticated against api.config.serverToken and communication happens over redis Pub/Sub. BE CAREFUL, as you can call *any* method within the API namespace on an actionhero server, including shutdown() and read *any* data on that node. ### Connections The new api.connections.apply(connectionId, method, args, callback) has been introduced. This allows any node in the cluster to modify a property of a connection, even one that isn’t located locally on this specific node. This uses the RPC tooling described above under the hood. ### Web Server Updates * actionhero’s web server can now accept the PATCH HTTP verb (thanks [@omichowdhury](https://github.com/omichowdhury)). This verb can also now be used in routes. * actionhero’s web server will now allow you to access the raw form variables (unsanitized by the actionProcessor). connection.rawConnection.params.body and connection.rawConnection.params.files are available within middleware and actions (Thanks [@omichowdhury](https://github.com/omichowdhury)) * adding a callback param will only convert the response to JSONp (application/javascript) if the header would have still been x/json * if the header isn’t application/json or application/javascript, the server will no longer attempt to JSON.stringify the connection.response. * This means you can manually create XML, Plain Text, etc responses as long as you also change the mime (IE: connection.rawConnection.responseHeaders.push(\[‘Content-Type’, ‘text/plain’]);) (thanks [@enginespot](https://github.com/enginespot)) * internally traded connectionHasHeader() for extractHeader() which will return the most recent header of a given name ### Middleware Priorities Thanks to [@innerdvations](https://github.com/innerdvations), you can now choose how to order the execution of your middleware (preProcessor and postProcessor, and connection callbacks). You should no longer push to those arrays (although your application won’t error).
You should now use api.actions.addPreProcessor(function, priority) and api.actions.addPostProcessor(function, priority) for actions and api.connections.addCreateCallback(function, priority) and api.connections.addDestroyCallback(function, priority) for connections. The priority in all the above is optional, and if not provided, the new api.config.general.defaultProcessorPriority will be used (defaults to 100). ### Room Middleware Per a discussion on the [mailing list](https://groups.google.com/forum/#!topic/actionhero-js), we have removed any automatic messaging actionhero might do for the chatrooms in favor of another type of middleware, chat middleware! This middleware allows you to control the messages and actions taken when clients join or leave a chat room. This should not be used for authentication. As we do not want to block the ability for a connection to join a room (we already have authentication tools in place), Chat Middleware does not have a callback and is executed "in parallel" to the connection actually joining the room. This middleware can be used for announcing members joining and leaving to other members in the chat room or logging stats. Use api.chatRoom.addJoinCallback(function(connection, room)) to add a Join Callback, and use api.chatRoom.addLeaveCallback(function(connection, room)) to handle connections leaving a room. You can optionally provide a priority to control the order of operations in the middleware. You can announce to everyone else in the room when a connection joins and leaves: ```js api.chatRoom.addJoinCallback(function (connection, room) { api.chatRoom.broadcast(connection, room, "I have entered the room"); }); api.chatRoom.addLeaveCallback(function (connection, room) { api.chatRoom.broadcast(connection, room, "I have left the room"); }); ``` ### REPL actionhero now has a REPL! This means you can ‘connect’ to a running instance of actionhero and manually call all the methods on the api namespace. This, combined with the new RPC tools, makes for a powerful debugging and development tool. Running grunt console will load up a version of actionhero in your terminal where you have access to the api object. This version of the server will boot, initialize, and start, but will skip booting any servers. The REPL will: * source NODE\_ENV properly to load the config * will connect to redis, and load any user-defined initializers * will load any plugins * will **not** boot any servers If you are familiar with rails, this is very similar to rails console. ### Variable Error Message Many of you have asked for the ability to change the string error messages actionhero uses. Perhaps english isn’t your user’s language, or you want to say something custom. Either way, there’s a new config file just for this: config/errors.js. Each error message is represented by a synchronous function which should return a string. Some functions are passed variables (like the connection) so you can customize your message. ### Performance Over the past few months, a [great conversation](https://github.com/evantahler/actionhero/issues/366) has been happening on GitHub about actionhero speed & performance. This conversation has led to a few small tweaks inside actionhero which have made a big difference. Most importantly, somewhere between v7.0.0 and v8.0.2 we changed the async-ness of the actionProcessor and cache system to rely on setImmediate rather than process.nextTick. This change made the server less susceptible to crashing under heavy load, but cost us significantly in speed.
This change was too costly and has since been reverted. Thank you to everyone who contributed to the conversation! The change to Primus not only allows for more flexibility in the websocket server, but in preliminary tests, performs much better than faye. ### Breaking Changes A list of things to watch out for when upgrading to v9.0.0 from v8.x.x: * actionhero now requires node > v0.10.0 * The browser-facing JS (actionHeroClient) has been updated. You are required to use the new JS in v9.0.0 projects * Actionhero now requires a load balancer with sticky connections to use websockets + long polling. Actionhero will no longer support websocket long polling + the node cluster module. "real" websockets + the node cluster module will continue to work. * The Redis config file /config/redis.js has new options. These new options are required for the RPC system. * We have removed /config/faye.js * For the updates to middleware order processing, a new config variable has been added to /config/api.js, defaultProcessorPriority : 100, * For the new error strings, the new /config/errors.js is required. * A number of new options have been added to /config/servers/websocket.js to manage this. Check out the [Primus](https://github.com/primus/primus) project for more information. ### Misc * api.utils.hashMerge/ api.utils.isPlainObject have been updated to check provided hashes (or nested hashes) for a special key, \_toExpand. If this key is false, this object will not be expanded on merge, and copied over literally. * the actionProcessor will now also append connection.actionStatus if a connection processes an action * actionStatus can be an error string, like ‘missing\_params’, null, true, or false * a status of true means the action ran, but there still may be a connection.error * this status is mostly used for setting HTTP error codes in the web server * Many new tests added for chat and RPC * Test cleanup overall for servers * Various small bug fixes and improvements * Various dependent packages updated to their latest versions * This includes updating [node-resque](https://github.com/taskrabbit/node-resque) to v0.8.0, which allows errors to be caught and suppressed via plugin *Originally published at 06 Jun 2014* --- --- url: /blog/post/2012-11-17-actionhero-client-turns-1.md description: >- actionhero-client is a nodeJS package to allow a remote nodeJS app to connect to and consume an actionHero API. --- ![](/images/medium-export/1__NDp0nEkCsIJw8kXYedA2pw.png) [actionhero-client](https://www.npmjs.com/package/actionhero-client) is a nodeJS package to allow a remote nodeJS app to connect to / consume an [actionHero API](http://actionherojs.com/). Improvements include: * TLS (secure) option in addition to raw TCP * Tests! * Cleaner API * More sensical event emitters The most interesting part of creating this client was dealing with the notion that you can have more than one ‘pending’ request out to the API at any moment. While node.js developers are used to dealing with parallel and async functions, I still always thought of remote requests in the HTTP way, mainly that I could always assume each request would be responded to with one and only one response. This isn’t true in raw socket land! If I were to request slowAction and then fastAction, it’s likely I would receive the response to fastAction first. actionHero makes use of a request counter (messageCount) for each request so that clients can differentiate responses, and know which call they were related to.
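As a sketch of the idea (illustrative only; the real actionhero-client internals and wire format differ in the details), the client keeps a map of pending callbacks keyed by that counter and looks each response up as it arrives:

```js
// Illustrative sketch of correlating raw TCP responses by messageCount.
// This is NOT the actual actionhero-client implementation; the port,
// payload shape, and framing are assumptions for the example.
const net = require("net");

let messageCount = 0;
const pending = {}; // callbacks waiting for a response, keyed by messageCount

const socket = net.connect(5000, "localhost");

function runAction(action, params, callback) {
  messageCount++; // the server numbers its responses with the same counter
  pending[messageCount] = callback;
  socket.write(JSON.stringify({ action: action, params: params }) + "\r\n");
}

socket.on("data", (chunk) => {
  // a real client would buffer partial lines; omitted here for brevity
  chunk
    .toString()
    .trim()
    .split("\r\n")
    .forEach((line) => {
      const response = JSON.parse(line);
      const callback = pending[response.messageCount];
      if (callback) {
        delete pending[response.messageCount];
        callback(response); // fastAction can resolve before slowAction
      }
    });
});
```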
I adopted the "holding pen pattern" (I made this term up just now) which stores async callbacks until a relevant response has been returned from the API. This still allows folks to use the common method(data, callback) pattern, but in a very unlinked way. This also makes it possible to queue requests and limit the number of ‘active’ queries (coming in a future version), and to time out requests which take too long. Anyway… [npm install actionhero-client](https://npmjs.org/package/actionhero_client) and you are good to go! --- --- url: /blog/post/2012-04-11-actions-vs-tasks.md description: >- I recently got a note from an actionHero user asking about the difference between tasks and actions within the framework. Here’s a bit of a… --- ![](/images/medium-export/1__MOFEPTlPprbExqciQ8W6JA.jpeg) I recently got a note from an actionHero user asking about the difference between tasks and actions within the framework. Here’s a bit of a clarification: Think of an **action** as something that happens in-line within the execution of a request. This would be something like "load this image and send it to the user" or "update this database entry". Normal website stuff. A **task** is something that is meant to be executed later by another process. This would be something like "send an email" or "update the cache". Think of this like a *delayed job* or *resque job* if you are coming from the Rails world. The way that actionHero handles **actions** is that any member of the cluster (any running node.js process) will respond to the request locally; just like a normal web server serving up a web page. **Tasks** however are enqueued then drained from the queue one by one. Each actionCluster member will run one task at a time, and then go on to the next one. When you define a **task**, you will say whether it can run on **"any"** member of the cluster or **"all"** members of the cluster (and if they should be periodically done at some frequency). For example, [this task](https://github.com/evantahler/actionHero/blob/master/tasks/saveCacheToDisk.js) is to be run by every member of the actionCluster, and it will save the local cache object to disc every hour to a file. We want every cluster member to do this to help recover from a crash. But say you are building a system which will send an email to new users who sign up. Rather than having the "signup confirmation" page wait until the email is sent, you might consider enqueuing a task "send email". You only want the user to receive 1 email, so "any" member of the action cluster can process this task (rather than "all"). You don’t care which one sends it, just as long as it gets done. Currently, only the master (first) process in the actionCluster will process "any" tasks, but in the near future he will be able to re-delegate those tasks back out to other members of the cluster so that more peers can help drain the queue. I’m also working on an enhancement which runs tasks in another thread/fork to help better utilize multi-core servers. --- --- url: /blog/post/2012-03-18-actionhero-v1.md description: HUZZAH! --- ![](/images/medium-export/1__KhGHUH__VYLotH6PAJZDg5Q.jpeg) ### HUZZAH! [ActionHero.js](http://actionherojs.com), the API framework for both HTTP and raw TCP clients is now at version 1! This release took over a month to push out, and focused on adding a full test suite so that we can all be sure that everything works as it should. This release also refactored the hell out of the network task system.
Now any peer in the cluster can submit a task to be run on the master peer. I wasn’t, however, able to accomplish my "philosophical" goal of having every node acting in an equal capacity to run tasks and to be a failover point for an enqueued task. That will (hopefully) be in the next version. Anyway, I am finally satisfied with the stability of the server and the interface. Enjoy! Here is the full change log: * initializers * you can add your own initializers to a folder called initializers in your project’s root. They will be loaded at boot and accessible like this: * actionHero.myNewInitFunction(api, function(){ next(); }); * This is a cleanup and bug-fix release * After some refactoring, actionHero is now at v 1.0 * The last message sent by a socket client can now be read by inspecting connection.lastLine * Better error handling if the socket / web port are in use * Cleanup of the example HTML pages * HTTP requests will now return serverInformation.currentTime * The original server (the one with no configData.actionCluster.startingPeer) will be the only server to run ‘any’ tasks * Other servers will NOT assume the role of running "any" tasks, but they will keep them in memory until the master server comes back upon a crash. * Using the node-mime module #### Check it out or download it from [github](https://github.com/evantahler/actionHero/) or [NPM](http://search.npmjs.org/#/actionHero) [**evantahler/actionhero**](https://github.com/evantahler/actionHero/) --- --- url: /blog/post/2016-07-07-actionheros-resque-ui.md description: 'Today I released a Resque UI for ActionHero, ah-resque-ui.' --- Today I released a Resque UI for ActionHero, [**ah-resque-ui**](https://github.com/evantahler/ah-resque-ui)**.** ![](/images/medium-export/1__CCa__6uFsJZQSWBlO4ajz8Q.png) This project has been in the works for a \*long time\*. I started the [node-resque](https://github.com/taskrabbit/node-resque) project at @TaskRabbit in 2013… and until now, there has been no [Node.js](https://medium.com/u/96cd9a1fb56) way to view the data in your resque [Redis](http://redis.io) database. I’ve been advising folks to spin up the Ruby [resque-web](https://github.com/resque/resque-web) project if they needed a UI… but that wasn’t a very good solution for a [Node.js](https://medium.com/u/96cd9a1fb56) project! For those of you who don’t know, Resque is a specification for a few data structures, implemented in Redis, which is used to enqueue, store, and work delayed jobs in a software application. Resque was first written by the folks at @github in Ruby, and has since been extended to almost every other language. There are even alternative implementations of the workers (the running code which consumes and works these enqueued jobs) in Ruby, most famously [Sidekiq](http://sidekiq.org/). Resque is the backing store for [ActionHero’s](http://www.actionherojs.com/) Task system. [**ah-resque-ui**](https://github.com/evantahler/ah-resque-ui) creates a number of actions which interface with ActionHero’s task system, and therefore node-resque under the hood. The front end is written in Angular 1.5. This plugin uses route injection and a proxy middleware so you can protect your actions (as they do allow you to mess with server-side data).
This is the first time I’ve used the proxy-middleware pattern, and I quite like it: ```js module.exports = { load: 99999999, initialize: function (api, next) { var middleware = { "ah-resque-ui-proxy-middleware": { name: "ah-resque-ui-proxy-middleware", global: false, preProcessor: function (data, callback) { return callback(); }, }, }; if (api.config["ah-resque-ui"].middleware) { var sourceMiddleware = api.actions.middleware[api.config["ah-resque-ui"].middleware]; middleware["ah-resque-ui-proxy-middleware"].preProcessor = sourceMiddleware.preProcessor; middleware["ah-resque-ui-proxy-middleware"].postProcessor = sourceMiddleware.postProcessor; } api.actions.addMiddleware(middleware["ah-resque-ui-proxy-middleware"]); next(); }, }; ``` You can download [**ah-resque-ui**](https://github.com/evantahler/ah-resque-ui) from NPM and add it as a plugin to your ActionHero project with a simple: npm install --save ah-resque-ui npm run actionhero -- link --name ah-resque-ui Enjoy! --- --- url: /blog/post/2012-03-23-actionhero-now-at-level-2.md description: actionHero is growing up so fast! --- actionHero is growing up so fast! ![](/images/medium-export/1__0JdqTgFyIb3kndbeZwBe6g.png) actionHero has undergone some significant improvements over the past 2 months to make him stronger and faster! We now use a [redis](http://redis.io/) backend to handle shared memory and for keeping track of tasks. Many of these ideas were stolen from the [Resque](https://github.com/defunkt/resque) project. ### As always, you can get the project from the [website](http://actionherojs.com/), [GitHub](https://github.com/evantahler/actionHero), or [NPM](http://search.npmjs.org/#/actionHero). Here are the release notes ([which are always available here](https://github.com/evantahler/actionHero/blob/master/versions.markdown)): #### V2.0.0 \*\* Redis-powered Cluster & major refactor \*\* \*\* Details \*\* This version realizes the dream of a true cluster for actionHero. There is no longer a need for a master process, and every node in the cluster can work alone or in a group. This release also enables using the node.js cluster module to make the most of your server(s). This version is likely to be incompatible with prior versions. Running an actionCluster now requires redis (running a single node does not require redis and is still a pure node.js implementation). Using a redis backend, actionHero nodes can now share memory objects and have a common queue for tasks. Philosophically, we have changed from a mesh network to a queue-based network. This means that no longer will every node talk to every other node, but rather all nodes will talk to redis. Now, I know that you are thinking "isn’t that bad because it creates a single point of failure?" Normally, the answer is yes, but redis already has mesh networking support! The suggested method of deployment is to set up a redis instance on each server, and they will handle the mesh networking for you. api.cache now works in a shared way if you are part of an actionCluster. All cache actions refer to redis and in this way, all peers can have access to shared objects. To avoid conflicts, you will have access to ‘lastReadAt’ as part of api.cache.load responses. actionHero will also no longer store its own cache to disc periodically as redis does this already. The task system also has undergone some major refactoring. All tasks are now stored in a shared queue within redis. All peers will periodically check the queue for unfilled tasks, and drain the queue one at a time.
In this manner, you can add more task capacity by spinning up more actionHero nodes which may or may not also handle web/socket traffic. This also means that tasks will not get lost if a node crashes as they will remain in the redis queue until drained. Each peer also has a ‘personal’ task queue for "all" tasks. For periodic tasks ("any" and "all"), the peer which most recently completed the task will hold the semaphore for that task (in an actionHero::tasksClaimed shared list) until the proper amount of time has elapsed, then they will re-enqueue the task. This does not mean that a specific node will always perform tasks of the same type. There are new requirements to config.json to configure redis. Here is an example: ``` "redis" : { "enable": true, "host": "127.0.0.1", "port": 6379, "password": null, "options": null }, ``` All methods under the api.actionCluster namespace have been removed for simplicity. Just use the normal cache methods, and if you are in a cluster, you will operate in a shared memory space. \*\* Notes \*\* * all peers now share the same api.cache data * api.tasks.enqueue is now api.tasks.enqueue(api, taskName, runAtTime, params). Set runAtTime to 0 to run the task as soon as possible * using redis cache save; no longer saving cache periodically * all nodes are created equal; there is no need for a master * the entire actionCluster namespace has been removed * there are new requirements to config.json to set up redis * every node will try to handle requests and process one job pending in the task queue at a time * shared tasks will be preferred over per-node tasks * the ‘status’ action has some new output type to reflect ‘global’ stats in comparison to ‘local’ stats (IE: count of web requests that this node has served vs total) --- --- url: /blog/post/2019-10-21-actions-tasks-and-destructured-params.md description: Welcome to the fourth installment of The Illustrated Actionhero Community Q&A! --- Welcome to the fourth installment of The Illustrated [Actionhero](https://www.actionherojs.com/) Community Q\&A! Every week in October I’ll be publishing a conversation from the [Actionhero Slack community](http://slack.actionherojs.com/) that highlights both a feature of the Actionhero Node.JS framework and the robustness of the community’s responses… and adding some diagrams to help explain the concept. ### Online and Offline Sync October 21st, 2019 [Source conversation in Slack](https://actionherojs.slack.com/archives/C04EVSUSD/p1566736126151100) Actionhero community member Nick asks: > I’ve noticed when running the latest AH, if I destructure the data param in an action run function to `{params, response, connection}`, when I write the output to response my endpoint returns nothing, unless I do an `Object.assign()`. Is this expected behavior? After some back and forth with other members of the community: > …honestly I’ve seen this behavior for some time, since the move to async and ES6… I believe around AH 17 First… what is destructuring? > Destructuring is a programming shorthand to simplify variable assignment by "breaking" the structure of complex objects or arrays. For example, these are valid examples of destructuring: ![](/images/medium-export/1__phVzjPGTY__ky66bFEsDnEw.png) In both cases, we’ve set the variables `firstName` and `lastName` without having to reach "into" the complex array or object.
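Since those examples appear as an image in the original post, here is a rough reconstruction of the kind of snippets pictured (the values are made up):

```js
// Example 1: destructuring an array by position
{
  const person = ["Evan", "Tahler"];
  const [firstName, lastName] = person;
  console.log(firstName, lastName); // "Evan" "Tahler"
}

// Example 2: destructuring an object by key
{
  const person = { firstName: "Evan", lastName: "Tahler" };
  const { firstName, lastName } = person;
  console.log(firstName, lastName); // "Evan" "Tahler"
}
```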
To learn more about all the cool things destructuring can do, I recommend [this excellent article by the team at Mozilla](https://hacks.mozilla.org/2015/05/es6-in-depth-destructuring/). Let’s take a look at the Action in question: ```js const { Action, api } = require("actionhero"); module.exports = class ListAvailableLessonDays extends Action { constructor() { super(); this.name = "ListAvailableLessonDays"; this.description = "Description"; this.inputs = { gradeNumber: { required: true }, }; } async run({ params, response }) { const { gradeNumber } = params; const { LessonService } = api.services; const { getAvailableLessonDays } = LessonService; const lessons = await getAvailableLessonDays(gradeNumber); response = lessons; // <-- problem! } }; ``` Nick is building a tool to help teachers manage their curriculums. A requestor provides a `gradeNumber` and the API then returns a list of saved lessons. He is destructuring the input object `data` passed to his run method into `params` and `response`. We can see the data passed into an Action’s run method: ![](/images/medium-export/1__R8PFi8FBcnld8uXKCwBzVQ.png) Since Actionhero can handle many different types of connections (http, websocket, direct TCP socket, etc), we need a generic way to represent the request to an action. Inside Actionhero, we have multiple types of servers responsible for handling each type of connection, and building a generic `connection` object, and figuring out what the request `parameters` (or `params` for short) are. The server is also responsible for sending the `response` of your action back to the client. To make a simple API for all of this, your action’s `run` method is passed one big `data` object with everything you might need. ```js data = { connection: connection, action: "randomNumber", toProcess: true, toRender: true, messageId: 123, params: { action: "randomNumber", apiVersion: 1 }, actionStartTime: 123, response: {}, }; ``` To learn more about how actions work, the [Action Tutorial](https://docs.actionherojs.com/tutorial-actions.html) has a lot of great information. Nick continues his investigation: > This code returns an empty response. If I leave it as data, and then do `data.response = lessons`, it returns the array as expected or if I do an `Object.assign(response, lessons)`, it will return the data but with the array converted to an object, for obvious reasons Said another way… ![](/images/medium-export/1__WjivaLxe__ZdJ6RMcgRxVYw.png) … why? Community member Chad saves the day: > This is standard ES6 behavior, a common gotcha. When you destructure you take a reference to the property in question. It is a pointer to it. If you say `response = lessons` you overwrite the POINTER, not the VALUE OF IT. > You are "repointing" your local var response to point to a local value lessons, not altering the pointer within the original data object. You could safely set `response.someValue`. But not overwrite the entire response itself. So, if you are adding properties to `response` (like `response.message`), you can use a destructured response, but if you are overwriting the entire response object, you should not destructure the inputs to your Action’s `run` method.
At [TaskRabbit](https://www.taskrabbit.com), we noticed that our users are starting to use emoji all over the place, from task descriptions to reviews. There are some problems when supporting the emoji character set with our stack, which includes Rails 4.0 and MySQL. The main problem is that MySQL’s utf8 encoding does not actually support 4-byte characters, which emoji require. In MySQL 5.5, the utf8mb4 encoding was introduced which allows for Multi-Byte (mb) strings… and therefore emoji would work! The [MySQL gem](https://github.com/brianmario/mysql2) introduced support for utf8mb4 about a year ago, but only recently did active\_record (and rails) add support for this in rails 4.1. Initially, we decided to ignore all emoji characters, literally stripping them out of strings with our [demoji](https://github.com/taskrabbit/demoji) gem (Thanks [Pablo](http://tech.taskrabbit.com/blog/2014/02/14/emojis)!). However, with our [new product launch in the UK](http://www.taskrabbit.co.uk), we thought it was time to actually address the problem. Here is what we learned: ### Migrating MySQL from utf8 to utf8mb4 The good news is that the upgrade path from utf8 to utf8mb4 is easy. As we are *adding* bytes, the migration is really just a definition change at the table-level. Nothing has to change with your existing data. This is a non-blocking and non-downtime migration. If you are using normal rails migrations, all of your column types for VARCHAR columns will be based on the table’s encoding. Changing the table will change the column type. The bad news is that any text-type (or blob-type) columns will need to be explicitly changed. Check out the migration steps: 1. change the DB’s encoding entirely, so new tables will be created in utf8mb4 2. alter all existing tables 3. explicitly update text-type columns ```ruby class Utf8mb4 < ActiveRecord::Migration UTF8_PAIRS = { 'users' => 'notes', 'comments' => 'message' # ... } def self.up execute "ALTER DATABASE `#{ActiveRecord::Base.connection.current_database}` CHARACTER SET utf8mb4;" ActiveRecord::Base.connection.tables.each do |table| execute "ALTER TABLE `#{table}` CHARACTER SET = utf8mb4;" end UTF8_PAIRS.each do |table, col| execute "ALTER TABLE `#{table}` CHANGE `#{col}` `#{col}` TEXT CHARACTER SET utf8mb4 NULL;" end end def self.down execute "ALTER DATABASE `#{ActiveRecord::Base.connection.current_database}` CHARACTER SET utf8;" ActiveRecord::Base.connection.tables.each do |table| execute "ALTER TABLE `#{table}` CHARACTER SET = utf8;" end UTF8_PAIRS.each do |table, col| execute "ALTER TABLE `#{table}` CHANGE `#{col}` `#{col}` TEXT CHARACTER SET utf8 NULL;" end end end ``` ### database.yml The only change here is to change the encoding: ```yml development: adapter: mysql2 encoding: utf8mb4 # <--- HERE database: my_db_name username: root password: my_password host: 127.0.0.1 port: 3306 ``` ### Index Lengths The last step here is to worry about index lengths, as mentioned above. If you are on rails 4.1, you have nothing to worry about! The rest of us have a few options: 1. [monkeypatch activerecord](https://github.com/rails/rails/issues/9855#issuecomment-28874587) 2. [change the index length within MySQL](http://dev.mysql.com/doc/refman/5.6/en/innodb-parameters.html#sysvar_innodb_large_prefix) 3. set the length to 191 within all index migrations We chose #2 due to the simplicity of the solution. Check the links above for a detailed discussion of the problem.
```ruby module ActiveRecord module ConnectionAdapters class AbstractMysqlAdapter NATIVE_DATABASE_TYPES[:string] = { :name => "varchar", :limit => 191 } end end end ``` And now you can emoji to your ❤’s content! --- --- url: /blog/post/2011-11-02-agile-business-intelligence.md description: >- A few months ago, Kate, Jenn, and I (all Modcloth co-workers) gave a talk at the world-famous pivotal labs about our unique "agile" way of… --- A few months ago, Kate, Jenn, and I (all [Modcloth](http://modcloth.com) co-workers) gave a talk at the world-famous [pivotal labs](http://pivotallabs.com/) about our unique "agile" way of approaching business intelligence problems in an Agile-development sort of way. We also spoke about how developers can use some common/free BI tools to make their lives easier. ### [Watch the video here](http://pivotallabs.com/talks/148-business-intelligence-tools-for-engineers) --- --- url: /blog/post/2014-02-13-airbrake-and-actionhero.md description: 'If you are using ActionHero version 10.x or higher, you can now use the plugin:' --- If you are using ActionHero version 10.x or higher, you can now use the plugin: [**evantahler/ah-airbrake-plugin**](https://github.com/evantahler/ah-airbrake-plugin) At [work](http://www.taskrabbit.com) we use [Airbrake](http://airbrake.io) to monitor exceptions in production. It’s a great tool that sends an email when something goes wrong (among other things). Now that we are using [ActionHero](http://www.actionherojs.com) in production, I wanted to integrate airbrake. To do this, I released [version 7.6.5](https://github.com/evantahler/actionhero/releases/tag/v7.6.5) which adds the ability to create custom error reporters. ActionHero already worked all requests and tasks within a domain and reported on them, but that reporting was limited to the output of the [winston](https://github.com/flatiron/winston) logger. Winston has some great plugins, but airbrake was not one of them. Once we had custom error reporting support, I moved the initial winston logger (using api.log) to the new format: ```js var os = require("os"); var consoleReporter = function (type, err, extraMessages, severity) { for (var i in extraMessages) { var line = extraMessages[i]; api.log(line, severity); } var lines = err.stack.split(os.EOL); for (var i in lines) { var line = lines[i]; api.log("! " + line, severity); } api.log("*", severity); }; api.exceptionHandlers.reporters.push(consoleReporter); ``` Felix has made a [great airbrake library for node](https://github.com/felixge/node-airbrake), which is simple to use with ActionHero’s new error reporter: ```js // in initializers/airbrake.js var airbrakePrototype = require("airbrake"); exports.airbrake = function (api, next) { api.airbrake = {}; api.airbrake.token = api.config.general.airbrake_token; api.airbrake.client = airbrakePrototype.createClient(api.airbrake.token); api.airbrake.client.handleExceptions(); // catch global uncaught errors // api.airbrake.client.developmentEnvironments = []; // don't report in various NODE_ENVs api.airbrake.notifier = function (type, err, extraMessages, severity) { api.airbrake.client.notify(err); }; api.airbrake._start = function (api, next) { api.exceptionHandlers.reporters.push(api.airbrake.notifier); next(); }; api.airbrake._stop = function (api, next) { next(); }; next(); }; ``` The last piece of the puzzle was informing airbrake about deployments, and clearing the previous errors.
The airbrake package already has support for this, so we just needed to make a grunt task we call on deployment: ```js // in /guntfile.js grunt.registerTask( "notifyAirbrakeDeploy", "tell airbrake we deployed", function (message) { var done = this.async(); init(function (api) { api.airbrake.client.trackDeployment(function () { done(); }); }); }, ); ``` --- --- url: /blog/post/2023-06-01-airbyte-checkpointing.md description: 'Airbyte Checkpointing: Ensuring Uninterrupted Data Syncs' --- ![Checkpointing!](/images/posts/2023-06-01-airbyte-checkpointing/image.jpg) ## Transient Failures A sync can fail for all sorts of reasons. Maybe there’s a network outage, maybe one of the processes managing the sync ran out of memory, or perhaps someone rebooted a router. Whatever the reason, it is impractical to think that a sync lasting a few hours moving terabytes of data won’t have some interruption. Airbyte’s job is to make sure your data continues to flow, even through these interruptions, and we do that through a process called **checkpointing**. Checkpointing is powerful because it means we can sync any volume of data, given enough time and retries. ## What is checkpointing? Checkpointing is a mechanism which the Airbyte Platform uses to resume any [incremental sync](https://docs.airbyte.com/understanding-airbyte/connections/#sync-modes) from where it left off in the previous attempt. Over the past year, we have been working to ensure that all the parts of Airbyte, our Sources, Platform, and Destinations can work in concert to support checkpointing. We are proud to share that today any source which supports incremental syncs is checkpointable (totally a real word), as are all of our cloud data warehouse destinations, including [Snowflake](https://docs.airbyte.com/integrations/destinations/snowflake/), [BigQuery](https://docs.airbyte.com/integrations/destinations/bigquery/), and [Redshift](https://docs.airbyte.com/integrations/destinations/redshift/). Our traditional SQL destinations ([MySQL](https://docs.airbyte.com/integrations/sources/mysql/), [Postgres](https://docs.airbyte.com/integrations/destinations/postgres/), etc) and [s3](https://docs.airbyte.com/integrations/destinations/s3/) also support checkpointing, with more destinations on the way. Airbyte’s checkpointing target is that no more than 30 minutes will pass without a checkpoint. That means no more than 30 minutes of sync time will need to be replayed on the next sync attempt if the sync were to fail. Checkpointing is only valid for connections which support incremental syncs, because it relies on asking the source to begin the next sync from a previous state. 30 minutes is the upper boundary, and many sources emit [state messages](https://docs.airbyte.com/understanding-airbyte/airbyte-protocol#state--checkpointing) more rapidly, often every time they paginate an API, or adjust limit or offset in a query. Destinations on the other hand have a fine line to walk - they need to balance efficiently writing to the destination with a guarantee to hit that 30 minute mark. For example, we use [staging files](https://docs.snowflake.com/en/user-guide/data-load-considerations-stage) to upload data to Snowflake, rather than INSERT queries directly, because it is (usually) faster. But, because there’s a cost to every time we upload a file and ask Snowflake to insert it, our destinations balance rapid checkpoints with efficient writes. ## How does checkpointing work? So how does checkpointing work? 
[The Airbyte Protocol](https://docs.airbyte.com/understanding-airbyte/airbyte-protocol#state--checkpointing) of course! Consider the following sync:

![Checkpointing!](/images/posts/2023-06-01-airbyte-checkpointing/whimsical.png)

In this example, the source sent 10 records and 3 state messages through the Airbyte platform before crashing. Checkpointing works on STATE messages. If a source sends a state message out, and the destination echoes that same state message back to the platform, that means “I have committed all the records the source gave me up to this point”. So, when the destination sends back State message A, that means that it has saved Record 1, Record 2, and Record 3 (e.g. persisted them in the destination database or uploaded them to S3 - whatever “persisting” means to that destination). Only at this point, when the destination confirms that the data is saved on its end, do we have a checkpoint.

In this example where State B was checkpointed but not State C, that means we have checkpointed up to Record 6. The next time we run this sync, we will start at State B, meaning we can skip records 1 through 6, and start with record 7, saving both time and money.

Observant readers will note that this will result in the last few records being sent more than once, which is by design - Airbyte is an “at least once” delivery platform. But don’t worry, many of our destinations have additional features, like deduplication, to clean up data on the other side. That’s one of the many neat things about an [ELT pipeline](https://airbyte.com/blog/elt-pipeline) - moving the data and cleaning the data happen independently! This allowed Airbyte to choose speed and reliability over “at most once” delivery of records.

## What’s next?

So what’s next? Airbyte will continue to add checkpointing to our destinations as they reach the [Generally Available release stage](https://airbyte.com/blog/connector-release-stages). Also, now that our cloud data warehouse destinations are resilient to failure, we are speeding them up! We are also making the tables we produce more intuitive and able to recover from problematic data - learn more about this work [here](https://github.com/airbytehq/airbyte/issues/26028).

Keep on Syncing!

---

---
url: /blog/post/2011-12-23-announcing-actionoero-js.md
description: >-
  I am working on a nodeJS framework! It’s out in beta RIGHT NOW. I would love
  some feedback.
---

![](/images/medium-export/1__0MnhZhYoNstrGV5HWjyv7w.png)

I am working on a nodeJS framework! It’s out in beta RIGHT NOW. [I would love some feedback](https://github.com/evantahler/actionHero).

[**evantahler/actionhero**](https://github.com/evantahler/actionHero)

Here’s a slice from the readme:

Who is an actionHero? actionHero is a minimalist transactional API framework written in javaScript for the node.js server. It was inspired by the DAVE PHP framework. The goals of actionHero are to create an easy-to-use package to get started making combination http and socket APIs RIGHT NOW. The actionHero API aims to simplify and abstract many of the common tasks that these types of APIs require. actionHero does the work for you, and he's not CRUD, and he's never taking a REST.

I was tired of bloated frameworks that were designed to run as monolithic applications which include M's, V's, and C's together in a single running application. This tethering of view to business logic doesn't make much sense in modern web development when your presentation layer can just as easily be a mobile application or a website.
There are also many scaling issues when you expect your single application to be able to handle all these separate types of consumers. The actionHero API defines a single access point and accepts GET and POST input. You define "Actions" which handle input and response, such as "userAdd" or "geoLocate". The actionHero API is NOT "RESTful", in that it does not use the normal http verbs (Get, Put, etc) and uses a single path/endpoint. This was chosen to make it as simple as possible for devices/users to access the functions, including low-level embedded devices which may have trouble with all the HTTP verbs. To see how simple it is to handle basic actions this package comes with a few basic Actions included. Check them out in api/actions/. You can get it and try it out RIGT NOW via github or [npm](http://search.npmjs.org/#/actionHero) Discuss the project on [Hacker News](http://news.ycombinator.com/item?id=3390456) Please use gitHub to leave me any comments about this project (including requests for more database types!) --- --- url: /blog/post/2026-03-13-announcing-keryx.md description: >- After 14 years of building Actionhero, I built a new framework from scratch. Keryx lets you write one Action class and deploy it across HTTP, WebSocket, CLI, background tasks, and MCP — all with the same validation and middleware. --- ![Announcing Keryx: A Full-Stack TypeScript Framework for APIs and MCP Servers](/images/posts/2026-03-13-announcing-keryx/image.png) I've been building [Actionhero](https://github.com/actionhero/actionhero) for over 14 years. It started as a side project in 2011 — a Node.js server that could speak HTTP and WebSocket from the same codebase, with background jobs built in. The core idea was simple: write your business logic once, expose it everywhere. Over the years, Actionhero picked up a few thousand GitHub stars, got used in production by companies I never expected, and even got approved by the VA for healthcare systems. I'm proud of it. But the world has changed. TypeScript won. Bun happened. Zod became the standard for validation. And then MCP showed up. [Keryx](https://www.keryxjs.com/) is what I'd build if I started Actionhero today — which is exactly what I did. ## What Is Keryx? Keryx is a full-stack TypeScript framework for building APIs and MCP servers. The core philosophy is the same one that drove Actionhero: you write your controller once, and it works across every transport your application needs. But "every transport" means something different in 2026 than it did in 2012. A single Keryx Action automatically becomes: * An **HTTP endpoint** with JSON/form data support * A **WebSocket handler** for real-time communication * A **CLI command** with auto-generated flags * A **background task** via Resque workers * An **MCP tool** that AI agents can discover and use Same validation. Same middleware. Same error handling. Five transports, one class. ## One Action, Every Transport Here's what that looks like in practice: ```typescript export class UserView implements Action { name = "user:view"; description = "View a user's profile by ID or email"; inputs = z.object({ id: z.number().optional(), email: z.string().email().optional(), }); web = { route: "/user", method: HTTP_METHOD.GET }; mcp = { tool: true }; async run(params: ActionParams) { const user = await findUser(params); return { user: serializeUser(user) }; } } ``` That's it. This class is simultaneously a `GET /user` endpoint, a WebSocket action, a `user:view` CLI command, and an MCP tool. 
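
As a rough illustration of the HTTP side only (a sketch, not official Keryx output: the port and response shape here are assumptions; the route and inputs come from the class above):

```bash
# the same Action, reached over plain HTTP
curl "http://localhost:8080/user?id=1"
# => { "user": { ... } }   # whatever serializeUser() returns

# invalid input is rejected by the same Zod schema, regardless of transport
curl "http://localhost:8080/user?email=not-an-email"
# => a validation error (z.string().email() rejects this value)
```
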
The Zod schema drives input validation for every transport, and it auto-generates your OpenAPI documentation. An AI agent running `user:view` gets the same validation and error handling as a curl request hitting `/user` — because it's the same code path. If you've used Actionhero, this should feel familiar. If you haven't, the idea is straightforward: your business logic shouldn't care how a request arrived. ## Why MCP Changes Things The first four transports — HTTP, WebSocket, CLI, tasks — those are table stakes for a full-stack framework. Actionhero had all of them (well, CLI was a stretch). The reason I built Keryx instead of continuing to evolve Actionhero is the fifth one: MCP. [MCP](https://modelcontextprotocol.io/) (Model Context Protocol) is how AI agents discover and use tools. It's becoming the standard interface between agents and the services they interact with. And here's the thing — if you're already defining your actions with typed inputs, descriptions, and structured outputs… you're 90% of the way to an MCP tool. The shape of a good Action and the shape of a good MCP tool are nearly identical. Keryx makes the last 10% automatic. Every Action is registered as an MCP tool with zero additional configuration. Your Zod schema becomes the tool's input schema. Your Action's name becomes the tool name. An AI agent can discover your API the same way a human developer reads your OpenAPI docs — except the agent gets a protocol it natively understands. That also means Keryx gives you per-session agent isolation and OAuth 2.1 with PKCE out of the box. Because when agents are calling your API, authentication and scoping aren't optional — they're the whole game. ## Modern Defaults Actionhero was built in a world of callbacks and `npm install`. Keryx is built for how we write TypeScript today: * **Bun** as the runtime — native TypeScript execution, no compilation step, fast * **Zod** for validation — type-safe schemas that generate OpenAPI docs * **Drizzle ORM** with auto-migrations — your database schema lives in TypeScript * **Redis** for PubSub, caching, and job queues (via [node-resque](https://github.com/actionhero/node-resque), which also got a refresh) * **OpenTelemetry** for metrics and structured logging — observability from day one These aren't plugins you install later. They're built in, configured, and working when you scaffold a new project. I've started enough projects to know that the things you defer to "later" are the things that never get done. ## What About Actionhero? Actionhero isn't going anywhere. It's stable, it works, and people depend on it. But I'm not going to pretend it's where my energy is going. Keryx is the framework I want to build with, and I think it's the better choice for new projects. If you're running Actionhero in production today, there's no urgency to migrate. If you're starting something new… take a look at Keryx. ## Getting Started ```bash bunx keryx new my-app cd my-app cp .env.example .env bun install bun dev ``` You'll need Bun, PostgreSQL, and Redis installed locally. The scaffolded project comes with example Actions, database migrations, and a working MCP server — so you can point an AI agent at it immediately. The docs are at [keryxjs.com](https://www.keryxjs.com/), and the project is MIT licensed. Source: [keryxjs.com](https://www.keryxjs.com/) *Evan Tahler is Head of Engineering at [Arcade](https://arcade.dev). 
He's the creator of [Actionhero](https://github.com/actionhero/actionhero), [node-resque](https://github.com/actionhero/node-resque), and now [Keryx](https://www.keryxjs.com/).*

---

---
url: /blog/post/2024-04-04-announcing-record-change-history.md
description: Don't stop syncing just because your data is bad.
---

![a big record](/images/posts/2024-04-04-announcing-record-change-history/image-1.png)

Airbyte’s job is to move data between systems in the best possible way. But what exactly does "best" mean in this context? It embodies the balance of many, often competing goals. One of those balancing acts is the compromise between ensuring the highest level of data precision and achieving compatibility across various systems.

## The Challenge of Data Compatibility

In the realm of data movement, one of the most important aspects we deal with is data compatibility. The [Airbyte Protocol](https://docs.airbyte.com/understanding-airbyte/airbyte-protocol) describes a series of [data types](https://docs.airbyte.com/understanding-airbyte/supported-data-types) that all sources must serialize their content into while in transit, and we require that all destinations be able to store every one of these types. This creates a common language and it’s how destinations can be interoperable with [as many sources as possible](https://docs.airbyte.com/integrations). This setup not only enables compatibility across diverse systems but also allows our platform to effectively validate data from any source and offer features like [column selection](https://airbyte.com/blog/airbyte-column-selection-control-over-the-exact-data-to-sync) for every sync process.

However, these data types, while deliberately chosen for maximum compatibility, are limited. For instance, you might notice the absence of a decimal type. This choice is a strategic one, aiming to ensure that every source and destination can work with the available types. Yet, even within this well-thought-out system, we sometimes encounter challenges.

## Enhancing Sync Reliability with Record Change History

At Airbyte, we’ve been working to ensure that **one bad record won’t break your sync**, and that led us to the addition of Record Change History. This new feature offers a way to inform you that a record was modified in transit, to prevent such a record from otherwise being un-syncable. When we announced [Destinations V2](https://airbyte.com/blog/introducing-airbyte-destinations-v2-typing-deduping), we called out that this opened the door to new error-handling capabilities, and today we are happy to share one of them!

## How Does Record Change History Work?

Record Change History can best be demonstrated with an example. Imagine a record like this in a Postgres database:

```js
{
  type: "RECORD",
  record: {
    stream: "users",
    emitted_at: 123456789,
    data: {
      id: 1,
      first_name: "Evan",
      last_name: "Tahler",
      description: "Hello, my name is Evan, and I like long walks on the beach, but also computers and then also... (25MB of text follows)"
    }
  }
}
```

In this case, the description column from the users table in this Postgres database holds a very large text entry. Before Record Change History, the source-postgres database connector would try to serialize this record and probably succeed. But then, the Snowflake destination would have trouble since [Snowflake semi-structured data can only be 16MB](https://docs.snowflake.com/en/sql-reference/data-types-semistructured), causing the entire sync to fail due to this single oversized record.
Previously, the only workaround would be to use the column selection feature to skip the description column entirely, but then the reasonably sized descriptions on all of the other rows would be skipped as well. Now, with Record Change History, we have the tools to allow Airbyte to modify a record in-transit to solve certain classes of problems for records which we know won’t be able to make it all the way to the destination. In the previous example, the excessively large description would be nulled, and this modification would be transparently communicated to the users. These changes are recorded in a query-friendly manner in the destination, keeping you informed and your data syncs uninterrupted.

By the time the destination stores it, the modified record becomes:

```js
{
  type: "RECORD",
  record: {
    stream: "users",
    emitted_at: 123456789,
    data: {
      id: 1,
      first_name: "Evan",
      last_name: "Tahler",
      description: null // <--- changed!
    },
    meta: {
      changes: [
        {
          field: "description",
          change: "NULLED",
          reason: "DESTINATION_FIELD_SIZE_LIMITATION"
        }
      ]
    }
  }
}
```

And that means that you’ll now have new information about each record’s change history in your [V2 Destinations’s final tables](https://docs.airbyte.com/release_notes/upgrading_to_destinations_v2):

![final table!](/images/posts/2024-04-04-announcing-record-change-history/image-2.png)

## Benefits of Record Change History

This approach has a number of advantages:

* **Resilience against problematic rows**. A single problematic row no longer means a failed sync, which significantly boosts the reliability of your data movement.
* **Maintained query experience**. The vast majority of syncs don’t have per-record problems, so the query experience in your destination remains unchanged for most use-cases.
* **Compatibility with aggregations and analysis**. Even when changes are made to records, the majority of your data analyses, such as aggregations, remain viable. Typically, issues arise from just one oversized column, not the entire row. So, most of your data can still be used effectively.
* **Informed decision-making**. With detailed information about any changes made to a row, you can decide how to incorporate that row into your analysis. Going back to our previous example, perhaps you aren’t using the description column (e.g. you want to count how many new users you had today), so you can consider the changed row in your analysis. On the other hand, if you want to train some machine learning models on user descriptions, you should probably skip that row.
* **Easy monitoring of changes**. Finally, tracking any modifications made during the sync process is straightforward. You can easily monitor for changes with a simple query or integrate data quality tools like [Great Expectations](https://greatexpectations.io/) for more advanced monitoring.

```sql
-- Check for changed records
SELECT COUNT(1) FROM users_final_table WHERE length(_airbyte_meta.changes) > 0
```

## Types of Data Challenges Record Change History Can Handle

Today, Record Change History is used within Airbyte for the following 3 classes of problem:

1. **Record or property size overflows**. If at any point the destination cannot ‘fit’ the record, it will be nulled such that it can. This is especially important for Redshift destinations, which not only have a per-SUPER limit, but also a [per-JSON property limit](https://docs.aws.amazon.com/redshift/latest/dg/limitations-super.html).
We won’t ever null out a primary key or cursor field - problems with those special properties will still fail the sync, as that would render the record untraceable.

2. **Serialization issues**. Different databases have different validation and quality guarantees on the data that they can store. For example, in most versions of MySQL, it’s valid to store datetimes that aren’t real (e.g. February 31st as a date). Before Record Change History, Airbyte would validate that the timestamp values are valid, so this record would fail to serialize. It would also fail to be loaded into a strict destination, like Snowflake or Redshift. Now, we null out that datetime, lest the whole sync fail.

3. **Typecasting issues**. At times, we encounter sources that return data with the wrong type. For example, if a source declares that the id column is a number, but returns a string, that wouldn’t work for most destinations: if you try to insert a string into the numeric id column on Snowflake, an exception will be raised. “Range” issues often fall into this category as well; for example, if the destination receives a negative value for a year (e.g. B.C.E. dates), it would fail to be cast into a datetime, even though the year “200 BCE” is valid. Now, with Record Change History, those situations don’t fail the syncs.

Each of these types of problems will have a unique error reason in your destination data warehouses.

Over the next few months, we will be adding support for Record Change History to all certified sources and destinations at Airbyte. This is just one of many projects underway to dramatically improve our reliability, even in the face of strange data!

Do you have any questions or feedback for us? You can keep in touch by joining our Slack channel. If you would like to keep up to date with new Airbyte features, subscribe to our newsletter.

---

---
url: /blog/post/2015-03-12-ansible-static-dynamic-inventory.md
description: Using Ansible with a Dynamic list of hosts
---

![](/images/medium-export/1__MpJklbc7MeGNCgX3TvL1HQ.png)

At [TaskRabbit](https://www.taskrabbit.com) we use [Ansible](http://www.ansible.com) to configure and manage our servers. Ansible is a great tool which allows you to write easy-to-use playbooks to configure your servers, deploy your applications, and more.

### The problem

Normally, you run ansible commands from your laptop as you need them. This is great when provisioning or deploying, but it is hard to automate. Ansible has a product called [Ansible Tower](http://www.ansible.com/tower) which allows you to run those same commands via a web-UI, schedule them, and respond to webhooks. Tower is a nifty piece of software that does a lot of things right, however we were having trouble keeping our inventories (lists of servers) up-to-date between the lists in our ansible git repository and the Tower server itself.

The main issue is a change in philosophy. Ansible (the CLI tool) expects that your inventories live local to the project in INI-like files located in sensible places like ./inventories/production and ./inventories/staging. Ansible Tower expects that your inventory is dynamic, and always obtainable from a remote source like Amazon EC2’s API, or from a VMware Cluster. While we do use these services to host our servers, not all servers that are present should be ansible’d, and more importantly, not all variables that ansible needs will be obtainable from those sources.
In the Ansible project repo, you can keep both the groups and lists of servers, along with variables, like this:

```bash
###########
## HOSTS ##
###########

mysql-master.myapp.com
mysql-slave1.myapp.com
mysql-slave2.myapp.com
redis.myapp.com
web1.myapp.com
web2.myapp.com
web3.myapp.com
resque1.myapp.com
resque2.myapp.com

############
## GROUPS ##
############

[production]
mysql-master.myapp.com
mysql-slave1.myapp.com
mysql-slave2.myapp.com
redis.myapp.com
web1.myapp.com
web2.myapp.com
web3.myapp.com
resque1.myapp.com
resque2.myapp.com

[production:vars]
host_memory=8GB
host_disk=20GB
ansible_ssh_user=root

## DB ##

[mysql]
mysql-master.myapp.com
mysql-slave1.myapp.com
mysql-slave2.myapp.com

[mysql:master]
mysql-master.myapp.com

[mysql:vars]
host_memory=32GB
host_disk=5120GB

[redis]
redis.myapp.com

[app]
web1.myapp.com
web2.myapp.com
web3.myapp.com
resque1.myapp.com
resque2.myapp.com

[app:unicorn]
web1.myapp.com
web2.myapp.com
web3.myapp.com

[app:resque]
resque1.myapp.com
resque2.myapp.com
```

This type of layout allows you to define things in a simple way:

- hosts belong to groups
- groups can have variables
- you can override default variables with later group definitions down the file

To demonstrate this, you can see how all servers start with 8GB of RAM, but the mysql group later overrides this to 32GB. You also get the added bonus of having your entire infrastructure defined in one place. Our workflow amends this file when we add and remove servers. This means that with a simple git pull you can be sure that any ansible command you run will be run on the correct collection of servers. We wanted Tower to source the same file developers would be using locally, and not read in (potentially divergent) information via APIs.

Ansible Tower has a feature called "[Dynamic Inventory](http://docs.ansible.com/intro_dynamic_inventory.html)" which allows you to define your inventory via some other method, as long as it presents a standardized JSON output. Tower can reference these things as what they call an "Inventory Script". Using these tools, the question became: "How can we source a file as if it were a changing API?"

The answer had a few parts (in ruby):

### 1. Find the inventory file

Tower does not keep the git repo of your ansible project(s) in a single place. It versions them and moves them around as you update it. To that end, finding the most current version of your ./inventories/production file is non-trivial:

```ruby
class InventoryFinder
  def find(inventory_file)
    # On Production server
    if File.exists? '/var/lib/awx/projects/'
      folder = Dir.glob('/var/lib/awx/projects/*').max { |a,b| File.ctime(a) <=> File.ctime(b) }
      return folder + '/inventories/' + inventory_file
    # Assume we are within the proper project
    else
      return File.dirname(__FILE__) + '/../inventories/' + inventory_file
    end
  end
end
```

### 2. Parse the Inventory

You can define groups and variables in a few legal ways within an inventory file. You can do the \[group:vars] method in the example above, or you can do it in-line as you define the server for the first time.
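
For example, in-line variables simply ride along on the same line as the host; a small sketch reusing the hosts and variables from above:

```bash
# in-line host variables: defined once, where the host first appears
web1.myapp.com   host_memory=8GB   host_disk=20GB
redis.myapp.com  host_memory=16GB  ansible_ssh_user=root
```
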
Keeping all this in mind, here’s our parser:

```ruby
class InventoryParser
  def initialize(inventory_path)
    @inventory_path = inventory_path
    @data = { "_meta" => { "hostvars" => {} } }
  end

  def inventory_path
    @inventory_path
  end

  def data
    @data
  end

  def ignored_variables
    [ 'ansible_ssh_user' ]
  end

  def file_lines
    File.read( inventory_path ).split("\n")
  end

  def parse
    current_section = nil

    file_lines.each do |line|
      parts = line.split(' ')
      next if parts.length == 0
      next if parts.first[0] == "#"
      next if parts.first[0] == "/"

      if parts.first[0] == '['
        current_section = parts.first.gsub('[','').gsub(']','')
        if data[current_section].nil? && !current_section.include?(':vars')
          data[current_section] = []
        end
        next
      end

      # variable block
      if !current_section.nil? && current_section.include?(':vars')
        parts = line.split('=')
        key = parts[0]
        value = parts[1]
        col = current_section.split(':')
        col.pop
        group = col.join(':')
        fill_hosts_with_group_var(group, key, value)
      # host block (could still have in-line variables)
      else
        hostname = parts.shift
        ensure_host_variables(hostname)
        d = {}
        while parts.length > 0
          part = parts.shift
          words = part.split('=')
          d[words.first] = words.last unless ignored_variables.include? words.first
        end
        data[current_section].push(hostname) if current_section
        d.each do |k,v|
          data["_meta"]["hostvars"][hostname][k] = v
        end
      end
    end

    return data
  end

  def ensure_host_variables(hostname)
    if data["_meta"]["hostvars"][hostname].nil?
      data["_meta"]["hostvars"][hostname] = {}
    end
  end

  def fill_hosts_with_group_var(group, key, value)
    return if ignored_variables.include? key
    if value.include?("'") || value.include?('"')
      value = eval(value)
    end
    data[group].each do |hostname|
      ensure_host_variables(hostname)
      data["_meta"]["hostvars"][hostname][key] = value
    end
  end
end
```

You will also note that we choose to ignore certain variables (via `ignored_variables`) that we want defined somewhere else within Ansible Tower (for example, SSH options). As a note, one feature of ansible’s inventory DSL that is not supported here is the notion of [children](http://docs.ansible.com/intro_inventory.html).

### 3. Running it

Once those classes are defined, you can create a single file (per environment) like so:

```ruby
#!/usr/bin/env ruby

require 'json'

class InventoryFinder
  #...
end

class InventoryParser
  #...
end

path = InventoryFinder.new.find('production')
data = InventoryParser.new(path).parse
puts JSON.pretty_generate( data )
```

You can load this code into the dynamic inventory and it will be ready to run!

### 4. Keeping it in sync

The final step is to ensure that any time a job is run from Tower, both the project repository and inventory are always updated. There are a few hooks you need to enable to do so:

First, in the settings for the project, you can enable a git pull before each project run. Be sure to enable Update on Launch under SCM options. Then, the same option, Update on Launch, can be enabled under the inventory source. When you define your inventory, you need to set its source to a "custom script", and from there, you can choose the inventory reader defined above.
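
For reference, the "standardized JSON output" Tower expects is just printed to stdout. Running a wrapper like the one above (named `production.rb` here purely for illustration) against the example inventory would emit something roughly like this trimmed sketch:

```bash
$ ./production.rb
{
  "_meta": {
    "hostvars": {
      "web1.myapp.com":         { "host_memory": "8GB",  "host_disk": "20GB" },
      "mysql-master.myapp.com": { "host_memory": "32GB", "host_disk": "5120GB" }
    }
  },
  "production": ["mysql-master.myapp.com", "web1.myapp.com", "redis.myapp.com"],
  "mysql":      ["mysql-master.myapp.com", "mysql-slave1.myapp.com"],
  "app":        ["web1.myapp.com", "web2.myapp.com", "resque1.myapp.com"]
}
```
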
With this in place, we are able to have our cake and eat it too:

- one file which contains all of our configuration
- developers can keep an up-to-date inventory source locally within the git ansible project
- Ansible Tower can source that file, and ensure that it is up-to-date before we run any job

---

---
url: /blog/post/2015-07-14-ansible-tips-and-tricks.md
description: >-
  On Monday 2015–07–14 I gave a (remote) talk at the Pittsburgh Code & Supply
  Meetup entitled "TaskRabbit’s Ansible Tips & Tricks”.
---

![](/images/medium-export/1__w0iIGUfsxNXvUBKqrC__uSA.png)

On Monday 2015–07–14 I gave a (remote) talk at the [**Pittsburgh Code & Supply Meetup**](http://www.codeandsupply.co) entitled "TaskRabbit’s Ansible Tips & Tricks”. The talk gave an overview of the journey TaskRabbit took from simple bash scripts, to Chef, and then to Ansible. I shared some of the tips and tricks we learned along the way, and some hacks to make your team more efficient.

---

---
url: /blog/post/2016-07-23-ansible-lets-encrypt-nginx-and-actionhero.md
description: >-
  If you don’t already know, LetsEncrypt! is an awesome project which aims to
  bring free HTTPS certificates to every site on the web. HTTPS…
---

![](/images/medium-export/1__bKxD__GvU2YDGkBdZNhVXcQ.png)

If you don’t already know, [LetsEncrypt!](https://letsencrypt.org/) is an awesome project which aims to bring free HTTPS certificates to every site on the web. HTTPS makes everything safer and more secure by protecting your information and browsing history while it is in transit from your computer to the server. Traditionally, getting an HTTPS certificate was a confusing and expensive process. It was also something of a racket, as certificate providers rarely provided a technical *service* per-se, just a guarantee that they were keeping their certificates safe, and that your website’s certificate, which was based off of theirs, was also safe. HTTPS trust goes something like "I trust DigiCert.com, and DigiCert.com says that *site.com* is safe… so I guess I trust site.com!" Anyway, you can read more about how LetsEncrypt! works on their site.

[SwitchBoard.chat](https://switchboard.chat/) is a small web site I’m running (based on [ActionHero](http://www.actionherojs.com/) of course) which allows your team to send and receive SMS messages in a centralized place, share messages and address books, and generally makes working with SMS for your team easier. As SwitchBoard.chat is a small service right now, the front-end runs entirely on one server… and is a great candidate for a basic LetsEncrypt! HTTPS certificate.

> *You can read the SwitchBoard.chat launch announcement* [*here*](https://blog.evantahler.com/switchboard-chat-d1aa51478dc6#.nw7yx1rkl)*.*

The fine folks at the EFF have created [CertBot](https://certbot.eff.org/), an easier-to-use wrapper around the LetsEncrypt! command line tools. [Ansible](https://www.ansible.com/) is a tool which helps you configure your servers (and deploy to them) automatically. We can combine both of these to make automatic HTTPS certificate generation a breeze!

> *An aside about LetsEncrypt!: Unlike a normal certificate authority, which grants certificates for 1, 2, or 3 years at a time, LetsEncrypt! \*only\* grants 3-month certificates. They do this because they \*want\* to encourage automation and re-generation of certificates.
This encourages folks to constantly be proving that they really do own the domain in question, and leads to a safer internet for all of us.*

For this setup, we are going to set up our front-end server like this:

![](/images/medium-export/1__p__dgj6h4dx6GNWUm5PPU3w.png)

First, ensure your DNS records are pointing to the server. We now have a chicken-and-egg problem. We want to run NGINX to serve our site (and validations for LetsEncrypt!) but we don’t have our certs yet, so NGINX won’t boot. So the first time we run this, we need to run a temporary web server, but every subsequent time, we’ll use Nginx. To generate those certs, here’s what we do:

```yaml
# tasks/main.yml

- name: install certbot dependencies
  apt: name={{ item }} state=present
  with_items:
    - build-essential
    - libssl-dev
    - libffi-dev
    - python-dev
    - git
    - python-pip
    - python-virtualenv
    - dialog
    - libaugeas0
    - ca-certificates

- name: install Python cryptography module
  pip: name=cryptography

- name: download certbot
  become: yes
  become_user: "{{ deploy_user }}"
  get_url: >
    url=https://dl.eff.org/certbot-auto
    dest=/home/{{ deploy_user }}/certbot-auto

- name: check if we've generated a cert already
  stat: path=/etc/letsencrypt/live/switchboard.chat/fullchain.pem
  register: cert_stats

- name: generate certs (first time)
  become: yes
  # become_user: '{{ deploy_user }}'
  shell: "/home/{{ deploy_user }}/certbot-auto certonly --standalone {{ letsencrypt_domain_flags | join(' ') }} --email {{ letsencrypt_email }} --non-interactive --agree-tos"
  when: cert_stats.stat.exists == False

- name: generate certs (subsequent time)
  become: yes
  # become_user: '{{ deploy_user }}'
  shell: "/home/{{ deploy_user }}/certbot-auto certonly --webroot -w /home/{{ deploy_user }}/www/switchboard.chat/current/public {{ letsencrypt_domain_flags | join(' ') }} --email {{ letsencrypt_email }} --non-interactive --agree-tos"
  when: cert_stats.stat.exists == True

- name: hup nginx
  service: name=nginx state=reloaded
```

The variables look like:

```yaml
# From group_vars/production

deploy_user: "me"
letsencrypt_email: "boss@switchboard.chat"
letsencrypt_domain_flags:
  - "-d switchboard.chat"
  - "-d www.switchboard.chat"
  - "-d api.switchboard.chat"
```

Here are the steps broken out:

* Install CertBot & dependencies (for Ubuntu/Debian)
* Check if we are running the first time (no cert on system yet) or a subsequent time
* Download the certbot-auto script and generate (or renew) the certs
* Reload Nginx

**How does this work?**

If we are running the first time, we are telling CertBot to spin up a temporary web server using the `--standalone` flag. If we already have a running web server (NGINX) we are telling CertBot to use ActionHero’s public directory to place its trust files. What CertBot does is generate some custom files (which look like */.well-known/{{gibberish}}*) and then tells the LetsEncrypt! server to try and load those generated URLs. If it can, it knows that you own the DNS addresses in question, and it grants you the certificate you asked for! CertBot can also run its own web server for the domain tests, but since we are already running NGINX, we don’t need to!

In our Ansible server provisioning step, we’ll then set up and run our ActionHero project.
You should configure it listen only on a local socket, IE: ```js // from config/servers/web.conf exports.production = { servers: { web: function (api) { return { port: "/home/deploy/www/switchboard.chat/shared/sockets/actionhero.sock", bindIP: null, padding: null, metadataOptions: { serverInformation: false, requesterInformation: false, }, }; }, }, }; ``` … and then you can configure NGINX to load your ActionHero project as a backend: Our NGINX.conf (and the Ansible role to manage NGINX) looks like this: ```yaml # handlers/main.yml - name: restart nginx service: name=nginx state=restarted - name: reload nginx service: name=nginx state=reloaded ``` ```text # templates/production.conf.j2 #user nobody; worker_processes 2; error_log /var/log/nginx/error.log warn; pid /var/run/nginx.pid; events { worker_connections 1024; # increase if you have lots of clients accept_mutex on; # "on" if nginx worker_processes > 1 } http { include mime.types; default_type application/octet-stream; server_tokens off; sendfile on; keepalive_timeout 65; server_names_hash_bucket_size 64; types_hash_max_size 2048; gzip on; gzip_http_version 1.0; gzip_comp_level 9; gzip_proxied any; gzip_types text/plain text/xml text/css text/comma-separated-values text/javascript application/javascript application/x-javascript font/ttf font/otf image/svg+xml application/atom+xml; log_format main '$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" "$http_x_forwarded_for" $request_time'; server { listen 80; server_name _; location /nginx_status { stub_status on; access_log on; allow 127.0.0.1; deny all; } location / { rewrite ^(.*) https://www.switchboard.chat$1 permanent; } } server { listen 443; server_name switchboard.chat; ssl on; ssl_certificate /etc/letsencrypt/live/switchboard.chat/fullchain.pem; ssl_certificate_key /etc/letsencrypt/live/switchboard.chat/privkey.pem; ssl_prefer_server_ciphers On; ssl_protocols TLSv1 TLSv1.1 TLSv1.2; ssl_session_cache shared:SSL:10m; return 301 https://www.switchboard.chat$request_uri; } server { proxy_redirect off; listen 443 default_server; server_name _; ssl on; ssl_certificate /etc/letsencrypt/live/switchboard.chat/fullchain.pem; ssl_certificate_key /etc/letsencrypt/live/switchboard.chat/privkey.pem; ssl_prefer_server_ciphers On; ssl_protocols TLSv1 TLSv1.1 TLSv1.2; ssl_session_cache shared:SSL:10m; access_log /var/log/nginx/access.switchboard_chat.log main; error_log /var/log/nginx/error.switchboard_chat.log; client_max_body_size 10M; location /primus { proxy_http_version 1.1; proxy_buffering off; proxy_set_header Upgrade $http_upgrade; proxy_set_header Connection "Upgrade"; proxy_set_header Host $host; proxy_pass http://unix:/home/{{ deploy_user }}/www/switchboard.chat/shared/sockets/actionhero.sock; } location / { root /home/{{ deploy_user }}/www/switchboard.chat/current/public/; expires 1m; try_files /$uri/index.html /$uri.html /$uri @app; } location @app { proxy_pass http://unix:/home/{{ deploy_user }}/www/switchboard.chat/shared/sockets/actionhero.sock; } } } ``` ```yaml # tasks/main.yml - name: ensure the nginx dir file: path=/etc/nginx state=directory owner=root - name: ensure the nginx log dir file: path=/var/log/nginx state=directory owner=nobody group=nogroup - name: ensure the default site is removed file: path=/etc/nginx/sites-{{ item }}/default state=absent with_items: - enabled - available notify: - restart nginx - name: nginx.conf template: src=production.conf.j2 dest=/etc/nginx/nginx.conf notify: - reload nginx - 
name: install nginx apt: pkg=nginx state=present notify: - restart nginx - meta: flush_handlers ```

You’ll notice that we are using `try_files` to attempt to have NGINX serve static files out of ActionHero’s public directory. While ActionHero *can* serve static assets for you, NGINX is simply better and faster at it. This also allows NGINX to continue serving assets if the ActionHero server is down for some reason, and you can use JavaScript in your front-end code to render an appropriate error message to your visitors if a health check fails.

NGINX is also sourcing our HTTPS certificates from a strange place… This is where CertBot will place our certs. That directory is actually a collection of symlinks which CertBot will update as needed as it renews them for us. Now, every time you run Ansible, you will check if there are updates needed to your HTTPS certificates, and get new versions.

**HTTPS Automated.**

---

---
url: /blog/post/2013-03-10-api-first-reboot.md
description: >-
  After sitting idle for over a year, Fiona (Fiona ODMC) and I have decided to
  reboot API-First.com.
---

After sitting idle for over a year, [Fiona](http://twitter.com/fionaodmc) ([Fiona ODMC](https://medium.com/u/a2165e95e3bf)) and I have decided to reboot [API-First.com](http://api-first.com).

![](/images/medium-export/1__ngHc__S43t468vMf7KDKpiw.jpeg)

We haven’t been checking it regularly, but the site seems to have been making an impact. Without any promotion, we were getting around 50 unique visitors a day. It also looks like we piqued the interest of some industry thought leaders. I’ll be blogging over at [API-First.com](http://api-first.com) from time to time, so head over and check it out!

For those of you who don’t know, API-First is a way of designing software and structuring teams around the idea that you should rally around an API. This helps your application scale to multiple devices and ensures your teams can work efficiently in a distributed way.

The main change to the API-First.com site is that it is now [OPEN SOURCE](https://github.com/evantahler/api-first). We have removed the comments section in favor of pull requests. We swapped the format to a blog because we want to hear your stories of how developing in an API-First way has helped or hurt your teams (helped, I hope!). Fork the repo and add your stories today!

[**evantahler/api-first**](https://github.com/evantahler/api-first)

---

---
url: /blog/post/2023-01-03-architecture-notes-2022-ctf.md
description: How I solved the Architecture Notes 2022 Capture the Flag Puzzles
---

![Capture the Flag](/images/posts/2023-01-03-architecture-notes-2022-ctf/ctf.png)

[Capture the Flag](https://en.wikipedia.org/wiki/Capture_the_flag_\(cybersecurity\)) challenges are basically Escape Rooms for software engineers. It's an obscure puzzle with opaque instructions that challenges you to think outside the box and learn new things along the way. They are both fun and frustrating, rewarding and ridiculous.

[Architecture Notes](https://architecturenotes.co) is a new newsletter started in 2022 that covers software architecture and system design, mostly focused on web technologies. I'm a subscriber, and you should be too! This is the first year they've existed, and the first year they have done a CTF. What follows are my spoiler-ridden notes on how I solved the challenge(s).

## You can try the CTF yourself [here](https://ctf.architecturenotes.co/)

Good luck!
***

## Flag 1: Encryption

You start on a simple enough page, [ctf.architecturenotes.co](https://ctf.architecturenotes.co/). There isn't really much explaining what to do, so I decided to read the source of the page. Lucky for us, it's static HTML and (possibly) written by hand, so it's easy to follow. The only thing that looks out-of-place is:

```html
first programmer ever?
```

Ok, so the first programmer ever is probably [Ada, Countess of Lovelace](https://en.wikipedia.org/wiki/Ada_Lovelace), but what do I do with that information? I downloaded the image in that tag, and then went to work trying to figure out what to do with it.

The first dead-end I hit was looking for slightly-off-color pixels (e.g. [something like this](https://null-byte.wonderhowto.com/how-to/guide-steganography-part-3-hide-text-and-images-pictures-0130893/)) in the image which would reveal information. In Photoshop, I converted the image to grayscale and messed with the saturation, to no avail.

For my next dead-end, I remembered reading an article about how you can hide strings within the raw image body, so I opened it up in vim. While looking at the mostly gibberish image content in vim, I did find part of the contents that was more-or-less human readable XML! This was the EXIF header information for the image. If you open up the image in most viewers, MacOS `preview` included, you can see the EXIF information formatted better:

![Capture the Flag](/images/posts/2023-01-03-architecture-notes-2022-ctf/exif.png)

That description looks rather sus! Knowing that this challenge has *something* to do with encryption, I guess I should try to decrypt the string! Being very lazy, I googled "decryption tools" and found myself at... web-based [decryption tools](https://www.devglan.com/online-tools/rsa-encryption-decryption). I tried various formats and passphrase permutations of `ada` and `lovelace`, with no luck. I also remembered that most AES encrypted strings tend to end with `==`, and this one did not. Time to move on to encodings!

It turns out that this string was base64 encoded (thanks [base64decode.org](https://www.base64decode.org)!). Once you decode it, you get a more traditional-looking RSA public key (ending in `==`, but more importantly starting with `--BEGIN PUBLIC KEY--`), and a helpful message!

> ok, I see you. Now what? maybe there is something at /flag that could help.

If you curl `https://ctf.architecturenotes.co/flag` you learn that you are expected to POST some data... but what? Well, after a few tries and help from the fine folks at [Stack Overflow](https://stackoverflow.com/questions/67648523/how-to-encrypt-a-small-text-using-openssl-with-a-given-public-key), I ended up encrypting the word "ada" with the public key and posting it:

```bash
echo "... the key ..." > pub.key
echo "ada" > plaintext.txt
openssl rsautl -encrypt -pubin -inkey pub.key -in plaintext.txt | base64 > encrypted.txt
curl -d @encrypted.txt "https://ctf.architecturenotes.co/flag" -v
```

And finally, I had the CTF solved! A novel way to prove you got to the end was that the creator of the puzzle, [@myusuf3](https://twitter.com/myusuf3), asked you to tweet something very specific to him. It's a great way for him to boost his Twitter followers, and start a dialogue with the players.
As a note, the headers for `ctf.architecturenotes.co/flag` contain cache information for Cloudflare, which was another dead-end: ```text report-to: {"endpoints":[{"url":"https:\/\/a.nel.cloudflare.com\/report\/v3?s=lCArkhBrXosMGmu8L4d1VHb0Qwlq3pZbWsBj%2BcbllnCsBgkdJLOO%2Ff2gedkaTl%2FSAwxuDQDMGjhMAranlRf7dZIa4%2FHFroO8cEfPntCmZxY50C0N7%2FsvPxTXENyO%2B17tawYenmMB1WtSZL%2FPFlq6UZtjS53qaOs%3D"}],"group":"cf-nel","max_age":604800} ``` That was fun, and took around an hour to solve. On to the next one! ## Flag 2: Coming Soon `#TODO` --- --- url: /blog/post/2013-02-18-authetntication-with-actionhero.md description: '!! This post is out of date! For an up-to-date version, click here' --- It seems that [ActionHero](http://actionherojs.com) has been picking up some popularity lately, and I’ve been getting a few questions about creating an authentication system with actionHero. Here’s a short post with some examples on how to get this done. ### What do we need? There is **not** an authentication or user system which ships with actionHero. There’s not even an ORM. There are many great ORMs out there, and actionHero doesn’t have an opinion on which one you should use. However, for a user system, you do need *some* sort of persistence. For this example, we’ll be using: * a mysql database * the [sequelize](http://www.sequelizejs.com/) ORM to help us with migrating the database and models * actionHero’s built-in cache to handle user sessions By no means is this a "full" production-ready authentication system, but this should serve as an example to get you started. ### Setting up the project There are a few new folders we need to make to keep our project sane. Here’s my folder structure (with non-standard actionHero directories **bolded**): ```text \ | - actions | - initializers | - log | - pids | - **migrations** | - **models** | - node_modules | - public | - tasks ``` ### Setting up the database First, we need to set up the database. Sequelize has 2 methods of manipulating tables: model sync and migrations. We’ll be using migrations so we can incrementally update our schema if we need to. Let’s create a migration to make our users table: #### **migrations/addUserTable.js** ```js // Note! The real name of your migration must be in the sequelize's timestamp format, and look something more like `20130326205332-addUserTable.json` // It would be best to use the `sequelize` binary to build your migration file. module.exports = { up: function (migration, DataTypes) { migration.createTable("Users", { id: { type: DataTypes.INTEGER, primaryKey: true, autoIncrement: true }, createdAt: { type: DataTypes.DATE }, updatedAt: { type: DataTypes.DATE }, email: { type: DataTypes.STRING, defaultValue: null, allowNull: false, unique: true, }, passwordHash: { type: DataTypes.TEXT, defaultValue: null, allowNull: true, }, passwordSalt: { type: DataTypes.TEXT, defaultValue: null, allowNull: true, }, firstName: { type: DataTypes.TEXT, defaultValue: null, allowNull: true, }, lastName: { type: DataTypes.TEXT, defaultValue: null, allowNull: true, }, }); }, down: function (migration) { migration.dropTable("Users"); }, }; ``` and a sequelize model which will use this new table. 
Note that the model doesn’t need to be informed about the ID and timestamps because we are using the defaults from sequelize #### **models/user.js** ```js module.exports = function (sequelize, DataTypes) { return sequelize.define("User", { email: { type: DataTypes.STRING, unique: true, validate: { isEmail: true, }, }, passwordHash: { type: DataTypes.TEXT }, passwordSalt: { type: DataTypes.TEXT }, firstName: { type: DataTypes.TEXT }, lastName: { type: DataTypes.TEXT }, }); }; ``` Now we need an initializer to run everything. Check out mysql.js which will connect to the database and run any migrations we have pending. ```js var fs = require("fs"); exports.mysql = function (api, next) { api.SequelizeBase = require("sequelize"); api.sequelize = new api.SequelizeBase( api.configData.mySQL.database, api.configData.mySQL.username, api.configData.mySQL.password, { host: api.configData.mySQL.host, port: api.configData.mySQL.port, dialect: "mysql", }, ); api.models = {}; var files = fs.readdirSync("models"); var models = []; for (var i in files) { models.push(files[i].split(".")[0]); } models.forEach(function (model) { api.models[model] = api.sequelize.import( __dirname + "./../models/" + model + ".js", ); }); var initDB = function (next) { var migrator = api.sequelize.getMigrator( { path: process.cwd() + "/migrations" }, true, ); migrator .migrate() .success(function () { api.sequelize.sync().success(function () { api.log("migrations complete", "notice"); next(); }); }) .error(function (err) { console.log("error migrating DB: "); throw err; process.exit(); }); }; initDB(next); }; ``` Note this example expects we would have added the following to config.js: ```js configData.mySQL = { database: "actionHero", username: "root", password: null, host: "127.0.0.1", port: 3306, }; ``` Booting the server should now create your users table. ### Sessions Now that we have a users table, how should we handle sessions? We want to create a session store that works not just for http(s) clients, but also for persistent websocket and tcp clients. We can use actionHero’s built-in store (which will be redis-backed in most cases) to help us out. 
Here’s an other initializer: #### **initializers/sessions.js** ```js exports.sessions = function (api, next) { api.session = { prefix: "__session", duration: api.configData.general.sessionDuration, }; api.session.save = function (connection, next) { var key = api.session.prefix + "-" + connection.id; var value = connection.session; api.cache.save(key, value, api.session.duration, function () { api.cache.load(key, function (savedVal) { if (typeof next == "function") { next(); } }); }); }; api.session.load = function (connection, next) { var key = api.session.prefix + "-" + connection.id; api.cache.load( key, function (error, value, expireTimestamp, createdAt, readAt) { connection.session = value; next(value, expireTimestamp, createdAt, readAt); }, ); }; api.session.delete = function (connection, next) { var key = api.session.prefix + "-" + connection.id; api.cache.destroy(key, function (error) { connection.session = null; next(error); }); }; api.session.checkAuth = function ( connection, noAuthCallback, happyAuthCallback, ) { api.session.load( connection, function (value, expireTimestamp, createdAt, readAt) { if (connection.session === null) { connection.session = {}; } else { var now = new Date().getTime(); if (connection.session.loggedIn != true) { connection.error = "You need to be authorized for this action"; noAuthCallback(connection, true); } else { // check to ensure the user is still ok in the DB api.models.user .find({ where: { id: connection.session.userId }, }) .success(function (user) { if (user == null) { connection.error = "This user has been deleted"; api.session.delete(connection, function () { noAuthCallback(connection, true); }); } else { connection.auth = "true"; happyAuthCallback(null, user); } }); } } }, ); }; next(); }; ``` There’s another config setting in use here: configData.general.sessionDuration = (1000 \* 60 \* 60 \* 4), // 4 hours. Note the api.sessions.checkAuth method. Here’s what we will using to validate actions are being called by logged in and valid users. Because sessions and connections might exist for a long while, we need to re-check the user against both the session store and the database each action. ### Creating a user. Here’s our first action: creating a user. This action doesn’t require any authentication because we need to allow new people to sign up. 
#### **actions/userAdd.js** ```js var crypto = require("crypto"); var action = {}; ///////////////////////////////////////////////////////////////////// // metadata action.name = "userAdd"; action.description = "I will create a new user (non-authenticated action)"; action.inputs = { required: ["email", "password", "firstName", "lastName"], optional: [], }; action.outputExample = {}; ///////////////////////////////////////////////////////////////////// // functional action.run = function (api, connection, next) { if (connection.params.password.length < 6) { connection.error = "password must be longer than 6 chars"; next(connection, true); } else { var passwordSalt = api.utils.randomString(64); var passwordHash = crypto .createHash("sha256") .update(passwordSalt + connection.params.password) .digest("hex"); api.models.user .build({ email: connection.params.email, passwordHash: passwordHash, passwordSalt: passwordSalt, firstName: connection.params.firstName, lastName: connection.params.lastName, }) .save() .success(function (user) { next(connection, true); }) .failure(function (error) { connection.error = error.message; next(connection, true); }); } }; ///////////////////////////////////////////////////////////////////// // exports exports.action = action; ``` Of note here is that each user gets a random salt, and we use SHA256 for our hash storage or the password (never actually store a users’ password!). You can use any hash function you like. ### Logging in. Now that we have a user, we can log him in. The goal of logging in, is to create a session for the user with auth = true. actionHero will already take care of laying cookies down for http(s) clients, and other clients will have a persistent and unique session.id which we can use as the session key. #### **actions/login.js** ```js var crypto = require("crypto"); var action = {}; ///////////////////////////////////////////////////////////////////// // metadata action.name = "login"; action.description = "I will log a user in"; action.inputs = { required: ["email", "password"], optional: [], }; action.outputExample = {}; ///////////////////////////////////////////////////////////////////// // functional action.run = function (api, connection, next) { api.models.user .find({ where: { email: connection.params.email }, }) .success(function (user) { if (user === null) { connection.error = "user not found"; next(connection, true); } else { var passwordHash = crypto .createHash("sha256") .update(user.passwordSalt + connection.params.password) .digest("hex"); if (user.passwordHash != passwordHash) { connection.error = "passwords don't match"; next(connection, true); } else { connection.session = { userId: user.id, loggedIn: true, }; connection.auth = "true"; if (connection._original_connection != null) { connection._original_connection.auth = "true"; } connection.response.userId = user.id; connection.response[ api.configData.commonWeb.fingerprintOptions.cookieKey ] = connection.id; api.session.save(connection, function () { next(connection, true); }); } } }) .error(function (error) { connection.error = error; next(connection, true); }); }; ///////////////////////////////////////////////////////////////////// // exports exports.action = action; ``` ### An authenticated action OK! We now have a logged in user, what we can we let him do!? 
Using our one helper method from before (api.session.checkAuth), we can allow this logged-in user to change some of his saved data in the database: #### **actions/userEdit.js** ```js var action = {}; ///////////////////////////////////////////////////////////////////// // metadata action.name = "userEdit"; action.description = "I edit a user"; action.inputs = { required: ["userId"], optional: ["firstName", "lastName", "email"], }; action.outputExample = {}; ///////////////////////////////////////////////////////////////////// // functional action.run = function (api, connection, next) { api.session.checkAuth(connection, next, function (err, dbUser) { var newData = {}; if (connection.params.email != null) { newData.email = connection.params.email; } if (connection.params.firstName != null) { newData.firstName = connection.params.firstName; } if (connection.params.lastName != null) { newData.lastName = connection.params.lastName; } dbUser.updateAttributes(newData).success(function () { next(connection, true); }); }); }; ///////////////////////////////////////////////////////////////////// // exports exports.action = action; ``` ### Fin This may seem like a lot, but in only 6 short files, we created everything we need from scratch for a working authentication system! Now you can extend this to your needs! --- --- url: /blog/post/2013-06-10-authentication-with-actionhero-again.md description: An update look at authenticating with Actionhero --- *Update @ 2014–05–11: As of ActionHero v8.0.8, connection.id is no-longer static for all web requests, in favor of connection.rawConnection.fingerprint. This post has been updated* ### Intro I had previously written about [authenticating with ActionHero](/2013-02-18-authetntication-with-actionhero), but that post is out of date as of actionHero v6.0.0. There have been some breaking API changes in actionHero which changed how connections work. Also, that first post was an overly complex example requiring a mysql database and ORM. As most folks are looking for an archetypical example of how to authenticate, I thought that it would be best to make it as simple as possible. ### Notes * We use actionHero’s [cache](https://github.com/evantahler/actionHero/wiki/Cache) methods which probably should not be used in production for this purpose. You can substitute the database of your choice within your own application. * Note that only 2 actions are needed, one to create the user and one to log in. * For HTTP clients, actionHero drops a session cookie which sets the connection.rawConnection.fingerprint. More information can be found [here](https://npmjs.org/package/browser_fingerprint). Logging-in will bind the session to the id of the http client, which is set in a cookie. * We create some common session methods to save and load a session in the cache for the connection which can be located and modified by actions. * note that when calling the actionCounter, the session.actionCounter is increased and stored. This is just so we can test that evrything it working. 
### Setup

* Create a new actionHero project as described on [www.actionHerojs.com](http://www.actionHerojs.com)
* create 3 new files:
  * `./node_modules/.bin/actionHero generateAction --name user`
  * `./node_modules/.bin/actionHero generateAction --name authenticatedAction`
  * `./node_modules/.bin/actionHero generateInitializer --name session`

### initializers/session.js

```js
exports.session = function (api, next) {
  api.session = {
    prefix: "__session:",
    duration: 60 * 60 * 1000, // 1 hour
  };

  api.session.connectionKey = function (connection) {
    if (connection.type === "web") {
      return api.session.prefix + connection.rawConnection.fingerprint;
    } else {
      return api.session.prefix + connection.id;
    }
  };

  api.session.save = function (connection, session, next) {
    var key = api.session.connectionKey(connection);
    api.cache.save(key, session, api.session.duration, function (error) {
      if (typeof next == "function") {
        next(error);
      }
    });
  };

  api.session.load = function (connection, next) {
    var key = api.session.connectionKey(connection);
    api.cache.load(
      key,
      function (error, session, expireTimestamp, createdAt, readAt) {
        if (typeof next == "function") {
          next(error, session, expireTimestamp, createdAt, readAt);
        }
      },
    );
  };

  api.session.delete = function (connection, next) {
    var key = api.session.connectionKey(connection);
    api.cache.destroy(key, function (error) {
      next(error);
    });
  };

  api.session.generateAtLogin = function (connection, next) {
    var session = {
      loggedIn: true,
      loggedInAt: new Date().getTime(),
    };
    api.session.save(connection, session, function (error) {
      next(error);
    });
  };

  api.session.checkAuth = function (
    connection,
    successCallback,
    failureCallback,
  ) {
    api.session.load(connection, function (error, session) {
      if (session === null) {
        session = {};
      }
      if (session.loggedIn !== true) {
        connection.error = "You need to be authorized for this action";
        failureCallback(connection, true); // likely to be an action's callback
      } else {
        successCallback(session); // likely to yield to the action
      }
    });
  };

  next();
};
```

### actions/user.js

```js
var crypto = require("crypto");
var redisPrefix = "__users-";

var calculatePasswordHash = function (password, salt) {
  return crypto
    .createHash("sha256")
    .update(salt + password)
    .digest("hex");
};

var cacheKey = function (connection) {
  return (
    redisPrefix + connection.params.email.replace("@", "_").replace(".", "_")
  );
};

exports.userAdd = {
  name: "userAdd",
  description: "userAdd",
  inputs: {
    required: ["email", "password", "firstName", "lastName"],
    optional: [],
  },
  blockedConnectionTypes: [],
  outputExample: {},
  run: function (api, connection, next) {
    if (connection.params.password.length < 6) {
      connection.error = "password must be longer than 6 chars";
      next(connection, true);
    } else {
      var passwordSalt = api.utils.randomString(64);
      var passwordHash = calculatePasswordHash(
        connection.params.password,
        passwordSalt,
      );
      var user = {
        email: connection.params.email,
        firstName: connection.params.firstName,
        lastName: connection.params.lastName,
        passwordSalt: passwordSalt,
        passwordHash: passwordHash,
      };
      console.log(cacheKey(connection));
      api.cache.save(cacheKey(connection), user, function (error) {
        connection.error = error;
        connection.response.userCreated = true;
        next(connection, true);
      });
    }
  },
};

exports.logIn = {
  name: "logIn",
  description: "logIn",
  inputs: {
    required: ["email", "password"],
    optional: [],
  },
  blockedConnectionTypes: [],
  outputExample: {},
  run: function (api, connection, next) {
    connection.response.auth = false;
    console.log(cacheKey(connection));
    api.cache.load(cacheKey(connection), function (err, user) {
      if (err) {
        connection.error = err;
        next(connection, true);
      } else if (user == null) {
        connection.error = "User not found";
        next(connection, true);
      } else {
        var passwordHash = calculatePasswordHash(
          connection.params.password,
          user.passwordSalt,
        );
        if (passwordHash !== user.passwordHash) {
          connection.error = "incorrect password";
          next(connection, true);
        } else {
          api.session.generateAtLogin(connection, function () {
            connection.response.auth = true;
            next(connection, true);
          });
        }
      }
    });
  },
};
```

### actions/authenticatedAction.js

```js
exports.action = {
  name: "authenticatedAction",
  description: "authenticatedAction",
  inputs: {
    required: [],
    optional: [],
  },
  blockedConnectionTypes: [],
  outputExample: {},
  run: function (api, connection, next) {
    api.session.checkAuth(
      connection,
      function (session) {
        if (session.actionCounter == null) {
          session.actionCounter = 0;
        }
        session.actionCounter++;
        connection.response.authenticated = true;
        connection.response.session = session;
        api.session.save(connection, session, function () {
          next(connection, true);
        });
      },
      next,
    );
  },
};
```

### Run it!

```text
http://localhost:8080/api/userAdd?email=evan@evantahler.com&password=password&firstName=Evan&lastName=tahler
http://localhost:8080/api/logIn?email=evan@evantahler.com&password=password
http://localhost:8080/api/authenticatedAction
```

All the error cases work as expected (password mismatch, trying to visit authenticatedAction before logging in, etc.)

### What this example doesn’t do

* edit and delete users
* check that a user still exists in api.session.checkAuth
* use a real ORM/database

--- --- url: /blog/post/2015-02-16-background-tasks-in-node.md description: >- On Thursday 2015–02–05 I gave a talk at the awesome SFNode Meetup entitled "Background Jobs for NodeJS". --- On Thursday 2015–02–05 I gave a talk at the awesome [SFNode Meetup](http://www.meetup.com/sfnode/) entitled "Background Jobs for NodeJS". ![](/images/medium-export/0__p6PmuaW4OvZfk7wd.jpg) The talk gave an overview of some of the many ways you can perform background tasks in node, which include:

* Foreground (in-line)
* Parallel (threaded-ish)
* Local Messages (fork-ish)
* Remote Messages
* Remote Queues (Resque-ish)
* Event Bus (Kafka-ish)

For every section, we show an example, and more interestingly, note how node makes every step better/faster/stronger… even the bad ideas! The idea for the talk came from a twitter conversation with [Dan Shaw](https://medium.com/u/7c861ae496fa), host of [NodeUP](http://nodeup.com/), about how easy it was to have multiple node workers in [Node-Resque](https://github.com/taskrabbit/node-resque)… Check out the presentation to learn how! [**evantahler/background\_jobs\_node**](https://github.com/evantahler/background_jobs_node) Video: --- --- url: /blog/post/2016-05-11-background-tasks-in-nodejs-a-survey-with-redis.md description: >- Today I gave a talk at RedisConf in San Francisco entitled: Background Tasks in Node.js: A survey with Redis. --- Today I gave a talk at [RedisConf](http://redisconference.com/) in San Francisco entitled: ***Background Tasks in Node.js: A survey with Redis.*** This was a talk I have wanted to give for a long time. As a DevOps engineer, there are a lot of ways to do everything, and when it comes to background processing, there are many ways to consume that job/worker/event/message.
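One of those ways, the Resque-style remote queue, is easy to sketch with [node-resque](https://github.com/taskrabbit/node-resque). This is a minimal example using the library's current promise-based API (the API at the time of these talks was callback-based), and the `add` job is made up for illustration:

```js
const { Queue, Worker } = require("node-resque");

const connectionDetails = { host: "127.0.0.1", port: 6379 };

// a trivial, hypothetical job definition shared by producers and workers
const jobs = {
  add: {
    perform: async (a, b) => a + b,
  },
};

async function main() {
  // the web process enqueues work into redis...
  const queue = new Queue({ connection: connectionDetails }, jobs);
  await queue.connect();
  await queue.enqueue("math", "add", [1, 2]);

  // ...and a separate worker process consumes it later
  const worker = new Worker(
    { connection: connectionDetails, queues: ["math"] },
    jobs,
  );
  worker.on("success", (queueName, job, result) => {
    console.log(`job on ${queueName} succeeded with result ${result}`);
  });
  await worker.connect();
  await worker.start();
}

main();
```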
![](/images/medium-export/1__o4crV68ZzYEiJ__l9zgqmXQ.png) This talk went through my personal evolution with background jobs, starting by doing them in-web-thread (PHP) and moving all the way up to event-bus systems. What was good and bad with each implementation, and where and why it was worth adding complexity, etc. The strategies discussed are:

* Foreground (in-line)
* Parallel (thread-ish)
* Local Messages (fork-ish)
* Remote Messages (\*MQ-ish)
* Remote Queue (redis + Resque)
* Immutable Event Bus (Kafka-ish)

Learn about many of the possible background task strategies you can use in your app… and how they are better when you use node.js and redis! My favorite part of the presentation was building [Kafka](http://kafka.apache.org/)-in-[redis](http://redis.io/). This included using Lua to make an atomic "ReadAndIncr" method to move the shared pointer when reading. You can see the Lua below, and how to use it in the codebase for the project. Supporting Code here: [**evantahler/background\_jobs\_node**](https://github.com/evantahler/background_jobs_node) Thanks! ![](/images/medium-export/1__YGoedPSrUzy2qvbqnTLSdw.png) --- --- url: /blog/post/2016-07-25-building-and-testing-actionhero-plugins.md description: >- Over the past few months, I’ve been working on projects which grew to become ActionHero Plugins. ActionHero is a Node.js framework for… --- ![](/images/medium-export/1__eUn__TuOSlq__8Gx0hPGdK5Q.png) Over the past few months, I’ve been working on projects which grew to become ActionHero Plugins. [ActionHero](http://www.actionherojs.com/) is a [Node.js](https://medium.com/u/96cd9a1fb56) framework for making API servers. ActionHero features a rich plugin system which allows developers to include pre-built tools and packages to extend their servers. Plugins can provide all of the functionality a top-level project can, including actions, background tasks, initializers, and even static files to be served via HTTP. You can learn more about plugins from the [ActionHero documentation](http://www.actionherojs.com/docs/#plugins). Two of the more complex plugins I’ve built recently:

* [ah-resque-ui](https://github.com/evantahler/ah-resque-ui): A UI for viewing and working with your tasks/resque from within ActionHero (more [here](https://blog.evantahler.com/actionheros-resque-ui-6b23b049197c#.nhw2dnz23))
* [ah-elasticsearch-orm](https://github.com/messagebot/ah-elasticsearch-orm): An ORM (Object Relational Mapper; Database Driver) for ElasticSearch.

This post is to share some of the patterns I’ve come to use when developing and testing these plugins.

#### The Basics: Namespacing

An ActionHero plugin is really just a collection of normal ActionHero components (tasks, actions, etc) which are injected into the top-level project at runtime. To this end, it’s important to remember that this *injection* can be destructive, and you should namespace \*everything\* to avoid collisions. For example, if I want to have a **status** action in my plugin to report on something specific to the plugin, I really should name that action **plugin:status**. This way, I’ll avoid clobbering something at the top level of the project. This similarly applies to tasks, and even static files. The way actionhero serves static files is that it first looks for the asset, say "/resque/index.html", in the public folders defined in your project directly (via **api.config.general.paths.public**). Then, if it doesn’t find that file, it starts looking in your linked plugins.
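To make that namespacing advice concrete, here is a minimal, hypothetical plugin action written in the style of that era; the plugin name, action, and response payload are all invented for illustration, but note the `pluginName:actionName` pattern:

```js
// actions/myPlugin-status.js: a hypothetical, namespaced plugin action
exports.action = {
  name: "myPlugin:status", // namespaced to avoid clobbering a top-level "status" action
  description: "report status information specific to myPlugin",
  inputs: { required: [], optional: [] },
  outputExample: {},

  run: function (api, connection, next) {
    connection.response.myPlugin = { healthy: true };
    next(connection, true);
  },
};
```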
Here again, I’ve namespaced the assets needed by this plugin via a route prefix to avoid clobbering anything with the top level project. You can see a good example of this in the public folder of [ah-resque-ui](https://github.com/evantahler/ah-resque-ui/tree/master/public/). Finally, plugins can have config files. When you run **actionhero link** to link a new plugin to your project, any files in the plugin’s **config** directory are copied to your top-level project. In this way you can have defaults for the plugin (whatever settings are in the config file to begin with), and the developer including your plugin can modify these easily. Here again, be sure to use a unique namespace as part of the **api.config** object.

#### A Lib Directory and Getting the API object in Scope

Just because an ActionHero plugin is a collection of normal ActionHero components, that doesn’t mean that’s \*all\* it has to be. Take a look at [ah-elasticsearch-orm](https://github.com/messagebot/ah-elasticsearch-orm). At the end of the day the plugin’s main job is to expose **api.elasticsearch** to your project, but to do so, we have a robust \*lib\* directory to build up many parts of what that initializer will do.

```js
// From ah-elasticsearch-orm/initializers/ah-elasticsearch-orm.js
module.exports = {
  loadPriority: 100,
  startPriority: 100,
  stopPriority: 999,
  initialize: function(api, next){
    var client = require(__dirname + '/../lib/client.js')(api);
    var search = require(__dirname + '/../lib/aggregate/search.js')(api);
    var mget = require(__dirname + '/../lib/aggregate/mget.js')(api);
    var count = require(__dirname + '/../lib/aggregate/count.js')(api);
    var scroll = require(__dirname + '/../lib/aggregate/scroll.js')(api);
    // ...
```

Since node makes it easy for us to reference local files (via **\_\_dirname**), we can consider files in this lib directory "private" to the plugin, and only what we expose to the api object will be "public". You’ll note that every sub-file within the lib directory is loaded as part of the **initialize** step of the initializer. This is so the API object will be passed in, and we can then subsequently pass it to our sub-files. The **module.exports** for each file in the library exposes a single loader function which accepts the API object to bring it in scope, for example:

```js
// from ah-elasticsearch-orm/lib/client.js
var elasticsearch = require("elasticsearch");

module.exports = function (api) {
  return function () {
    return new elasticsearch.Client({
      hosts: api.config.elasticsearch.urls,
      log: api.config.elasticsearch.log,
    });
  };
};
```

In this way, the elasticsearch package itself is private to this file, we can expose only a constructed client, and still read various configuration details from the normal API object.

#### Route Injection

There is only one special api method ActionHero exposes for use with plugins, and that is **api.routes.registerRoute()**. This method allows for route injection for the actions you have defined in your plugin. Configuring routes is the job of the top-level ActionHero project, but if your plugin defines many actions, it would be a pain to require the developer using your plugin to add all of your actions to the proper routing table. **api.routes.registerRoute** allows you to do this programmatically. Again, be sure to namespace your routes!

```js
api.routes.registerRoute("get", "/resque/locks", "resque:locks");
```

You can see a good example of this in [ah-resque-ui’s initializer](https://github.com/evantahler/ah-resque-ui/blob/master/initializers/ah-resque-ui.js).
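For instance, a plugin that ships several actions might register all of its namespaced routes from its own initializer. A rough sketch (the plugin name, routes, and action names here are hypothetical; the real ah-resque-ui initializer does quite a bit more):

```js
// initializers/myPlugin.js: a hypothetical plugin initializer that injects routes
module.exports = {
  loadPriority: 1000, // load after the top-level project has set up its own routes

  initialize: function (api, next) {
    // every route and action is prefixed with the plugin's namespace
    api.routes.registerRoute("get", "/myPlugin/status", "myPlugin:status");
    api.routes.registerRoute("post", "/myPlugin/items", "myPlugin:itemCreate");
    next();
  },
};
```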
#### Proxy Middleware

One interesting challenge when building plugins is dealing with middleware. For example, [ah-resque-ui](https://github.com/evantahler/ah-resque-ui/blob/master/initializers/ah-resque-ui.js) creates some fairly sensitive actions (delete all enqueued tasks, for example). We know that the top-level project should secure these actions, but we have no idea how. Do they have a user + session system? Will they limit access to only a certain IP address? We *can* assume that they will be using an [action middleware](http://www.actionherojs.com/docs/#action-middleware) to enable the protection they need… and we can proxy that in our plugin!

```js
// from ah-resque-ui/initializers/ah-resque-ui.js
var middleware = {
  "ah-resque-ui-proxy-middleware": {
    name: "ah-resque-ui-proxy-middleware",
    global: false,
    preProcessor: function (data, callback) {
      return callback();
    },
  },
};

if (api.config["ah-resque-ui"].middleware) {
  var sourceMiddleware =
    api.actions.middleware[api.config["ah-resque-ui"].middleware];
  middleware["ah-resque-ui-proxy-middleware"].preProcessor =
    sourceMiddleware.preProcessor;
  middleware["ah-resque-ui-proxy-middleware"].postProcessor =
    sourceMiddleware.postProcessor;
}

api.actions.addMiddleware(middleware["ah-resque-ui-proxy-middleware"]);
```

Here you can see that we build up a new middleware, but it contains a no-op preProcessor. In the [config file generated for this project](https://github.com/evantahler/ah-resque-ui/blob/master/config/ah-resque-ui.js), we ask for the string name of another middleware (**api.config\[‘ah-resque-ui’].middleware**), and if it is defined, we then reference its already-defined preProcessor and postProcessor. The only trick here is that our load priority must be high enough to ensure that the top-level project’s initializers have already fired so the original middleware will be in scope.

#### Testing with a Real ActionHero Project

All good software needs tests, and ActionHero plugins are no exception. However… how do you test something that needs to be required within a larger project to run? Well, we can do just that in our test suite… it’s not that hard!

![](/images/medium-export/1__MOagL7aq2Ht9e__so4tFe5g.png)

> [Please look at the specHelper from ah-elasticsearch-orm to see how this is done.](https://github.com/messagebot/ah-elasticsearch-orm/blob/master/test/specHelper.js)

To build a testing server:

* The first thing we need to do is install ActionHero in a temp location
* Then we link our local plugin into the temp project
* And now we can run the temp ActionHero server with our plugin loaded in! This applies to both testing and developing!

When using Mocha to run your tests, you can build a specHelper file which knows how to prepare your test suite, and export it.
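A stripped-down sketch of what such a helper might look like (the real specHelper linked above handles config overrides and cleanup; the paths, the project name, and the use of `actionheroPrototype` here are illustrative assumptions for the ActionHero versions of that era):

```js
// test/specHelper.js: a simplified, hypothetical sketch of preparing a temp project
var path = require("path");
var execSync = require("child_process").execSync;

var projectDir = path.join(__dirname, "tmp", "my_actionhero_project");

exports.specHelper = {
  api: null,

  build: function () {
    if (process.env.SKIP_BUILD === "true") { return; }
    execSync("mkdir -p " + projectDir);
    // 1) install ActionHero into a temp location
    execSync("npm install actionhero", { cwd: projectDir });
    // 2) link our local plugin (the package under test) into that temp project
    execSync("npm install " + path.join(__dirname, ".."), { cwd: projectDir });
  },

  start: function (callback) {
    var self = this;
    // 3) boot the temp ActionHero server with our plugin loaded in
    var ActionHero = require(path.join(projectDir, "node_modules", "actionhero"));
    var server = new ActionHero.actionheroPrototype();
    server.start({}, function (error, api) {
      self.api = api;
      callback(error);
    });
  },
};
```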
Then, every subsequent test requires the spec helper meaning that the helper methods you just defined are in scope: ```js var async = require("async"); var should = require("should"); var specHelper = require(__dirname + "/specHelper.js").specHelper; var api; describe("ah-elasticsearch-orm", function () { describe("framework", function () { before(function () { api = specHelper.api; }); it("server booted and normal actions work", function (done) { api.specHelper.runAction("status", function (response) { response.serverInformation.serverName.should.equal( "my_actionhero_project", ); done(); }); }); it("has loaded cluster info", function (done) { should.exist(api.elasticsearch.info.name); var semverParts = api.elasticsearch.info.version.number.split("."); semverParts[0].should.be.aboveOrEqual(2); done(); }); }); }); ``` This means you can write simple tests like the above, use ActionHero’s built in specHelper to run tasks and actions inline… and generally have a good testing experience. As building the temporary project might be slow, you can also add an environment variable to skip that part if you’ve done it once already, IE: **SKIP\_BUILD=true npm test**. Note: You don’t need to require ActionHero as a devDependancy in your **package.json**. And that is how I build & test ActionHero Plugins! ``` ``` --- --- url: /blog/post/2026-05-01-software-for-humans-and-agents.md description: >- I've been building a library and a framework in parallel, and the same design constraint keeps showing up in both: every piece of software now has two audiences, humans and agents. Here's what that changes — for libraries, and for frameworks. --- ![Building software for humans and agents](/images/posts/2026-05-01-software-for-humans-and-agents/image.png) I've been building a few things in my spare time. A library — [macos-ts](https://github.com/evantahler/macos-ts), which gives you typed APIs over your iCloud data (Notes, Messages, Photos, Contacts) and absorbs the SQLite madness so you don't have to. And a framework — [Keryx](https://www.keryxjs.com/), the fullstack TypeScript framework for MCP and APIs: one Action class, five transports, your API is automatically an MCP server, a WebSocket handler, a CLI tool, and a background task runner. I've learned that every piece of software I write now has two audiences: a human, and an agent acting on a human's behalf. They want different things. They forgive different things. They fail differently. And building for both at once changes how you write the code. That sounds like a fluffy thought-leader sentence, so let me make it concrete. ## Libraries: ship MCP next to your SDK Let's start with libraries, because the library is the simpler case. A good library does a few small things on a focused topic — the [Unix principle](https://en.wikipedia.org/wiki/Unix_philosophy). It probably manipulates data. Your app wants that data and now your agent does too. A library used to mean one thing: a typed surface a developer imports into their code, written in a specific language for a specific runtime. If your stack matched, great. If not, you went and found another library. You design it for the engineer reading the docstrings, you write it to be ergonomic from the IDE, you ship it with a README that opens with "Install." That's not enough anymore. Your users are running agents now, and those agents want to do the same things they'd reach for your library to do directly. If your library doesn't show up over MCP, it doesn't show up at all in the workflows that matter most. 
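What "showing up over MCP" can look like in practice: expose the same functions your SDK already exports as MCP tools. Here's a minimal sketch assuming the official TypeScript SDK's high-level `McpServer` API; the `list_notes` tool and the `listNotes()` function it wraps are hypothetical, and this isn't how macos-ts itself is organized:

```js
import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { StdioServerTransport } from "@modelcontextprotocol/sdk/server/stdio.js";
import { z } from "zod";
import { listNotes } from "./my-library.js"; // the function your SDK already exports (hypothetical)

const server = new McpServer({ name: "my-library", version: "1.0.0" });

// the tool description is documentation for the agent: spell everything out
server.tool(
  "list_notes",
  "List note IDs and titles. Call read_note afterwards to fetch a note's full content.",
  { limit: z.number().optional().describe("Maximum number of notes to return") },
  async ({ limit }) => {
    const notes = await listNotes({ limit });
    return { content: [{ type: "text", text: JSON.stringify(notes) }] };
  },
);

await server.connect(new StdioServerTransport());
```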
Half the internet is calling MCP "USB for agents." It's a goofy phrase, but it's basically right. MCP is the universal bus — and the underrated part is that it transcends the language your library was written in. A Python agent can call a TypeScript library. A Rust agent can call a Ruby library. The protocol is the contract; whatever your library is written in is now an implementation detail. The good news: the rules for shipping a good MCP server are the same rules for shipping a good library. Hide complexity. Return errors that explain what to do next. Write documentation that actually documents. The bad news: most existing libraries fail those rules in ways that humans politely tolerate and agents don't. ### Tool descriptions are documentation. Take them seriously. [Arcade's tool description pattern](https://www.arcade.dev/patterns/tool-description) puts it bluntly: > Do not assume the AI model will be able to infer anything that is not explicitly stated in the tool description, even if it's obvious from a human reasoning standpoint. Humans infer. They skim a function signature, glance at the type, click through to a usage example, and figure it out (or fail at compile time). Agents need it spelled out. Prerequisites, related tools, expected formats, when to use this tool versus that one. If you've ever written a really good docstring — the kind a junior engineer can pick up and use without asking questions — that's the bar. Now write every tool description that way. ### Shape your responses for the next call, not just this one Here's the response envelope every macos-ts tool returns. This one's from `list_notes`: ```json { "data": [ { "id": 42, "title": "Shopping List" }, { "id": 17, "title": "Q2 Planning Notes" }, { "id": 9, "title": "Recipes from Mom" } ], "totalResults": 3, "_next": [ { "tool": "read_note", "description": "Read a note's full markdown content" } ] } ``` That `_next` field is the part you'd never put in a normal SDK because a human would find it patronizing. Of course they know what to call next — they have an autocomplete and a documentation tab open. An agent has a tool list and a context window. Telling it "you probably want to call `read_note` after this" is a kindness, not a crutch. It saves a token round-trip and stops the agent from guessing. The errors do the same thing: ```json { "error": "NoteNotFoundError", "message": "Note not found: 999", "category": "not_found", "retryable": false, "recovery": "Use list_notes or search_notes to find valid note IDs." } ``` `retryable` tells the agent whether to back off or try something different. `recovery` tells it *what* something different looks like. An agent cannot recover from a failure it can't read — and "ENOENT: no such file or directory" is not, despite our long affection for it, a readable failure. This idea isn't original. [mcpx](https://github.com/evantahler/mcpx) does the same trick at a different layer — it pre-validates tool inputs locally against the JSON Schema before round-tripping to the server, so the agent gets `missing required field "repo"` instead of an opaque server error fifty milliseconds later. Cheap to write. Saves the agent a confused retry every time. ### The library gets better when you do this Here's the part that surprised me. macos-ts has a downstream consumer — [icloud-backup](https://github.com/evantahler/icloud-backup), a CLI tool that uses the *human* TypeScript API, not MCP. 
When I added the agent-facing surface (structured envelopes, `_next`, recovery hints), I expected to write more code for less ergonomic returns. The opposite happened. Designing for an agent forced me to be explicit about things I'd been hand-waving for the human caller too — which photos are local versus iCloud-only, which attachments live on disk versus inline, which errors are retryable. The CLI got cleaner because the MCP server forced the conversation. A library that's good for agents is a library that's just *good*. ## Frameworks: MCP is a transport Now zoom out. If a library is the unit of "here's a thing you can call," a framework is the scaffolding for "here's a service that exposes things." The framework's job is to take your business logic and put it in front of users. Plural. For the last decade-plus, "users" meant clients over HTTP, browsers over WebSocket, and operators on the CLI. Maybe a background queue. That was the contract: write your logic once, the framework picks the transport. Same logic on `GET /user`, on `socket.send("user:view")`, on `myapp user:view --id 42`. MCP is a new entry on that list. Not a layer above HTTP. Not a sidecar. A peer. You don't "add MCP support" to your service any more than you "add HTTP support" — you pick the transports your business logic should be reachable over, and the framework wires them up. This is the assumption [Keryx](/blog/post/2026-03-13-announcing-keryx) is built on. A single Action class declares its inputs once with Zod, its middleware once, its `run()` method once, and then the transport configs sit side by side as parallel properties: ```typescript class MyAction implements Action { inputs = z.object({ ... }); middleware = [ ... ]; web = { route: "/thing", method: HTTP_METHOD.PUT }; task = { queue: "default" }; mcp = { tool: true }; async run(params: ActionParams) { ... } } ``` Five transports in one controller. (HTTP, WebSocket, CLI, background tasks, MCP.) The transport is the only thing that changes about a request — its arrival, and its response shape. The validation, the auth, the audit log, the metrics, the error handling: all the same. That's the entire pitch. A few things this reframing buys you, beyond the obvious "write less code": * **MCP isn't just tools.** It's also resources and prompts. A framework should expose all three the way it exposes routes, sockets, and commands — first-class, declared on the action, generated from your existing types. * **OAuth becomes a framework concern, not an app concern.** When agents call your API, "logged in as Evan" is a load-bearing assumption. Your framework's auth needs to mean the same thing across HTTP and MCP, or you've shipped a backdoor. * **llms.txt is the new sitemap.** The framework should generate it for you, the way it generates `sitemap.xml`. Keryx doesn't yet — that's probably the next thing I add. ([llmstxt.org](https://llmstxt.org/) has the spec.) ## What this is really about It's tempting to read all of this as "ship MCP," and stop there. That's not the point. The shift is the audience. "The consumer of your software" used to mean a person at a keyboard, or another piece of code a person wrote. Now it includes an agent acting on someone's behalf — sometimes the same person, sometimes not. That audience needs the same care you'd give a human reader of your README. Clear names. Useful errors. Docs that don't make them guess. The libraries that ship that way will be the libraries that get used. 
The frameworks that ship that way will be the frameworks that build them. And honestly — having now done both — the work is mostly the work I should have been doing for human readers all along. The agents just don't let me cheat. Onward. *Evan Tahler is Head of Engineering at [Arcade](https://arcade.dev). He's the creator of [Actionhero](https://github.com/actionhero/actionhero), [Keryx](https://www.keryxjs.com/), [macos-ts](https://github.com/evantahler/macos-ts), and [mcpx](https://github.com/evantahler/mcpx).* --- --- url: /blog/post/2021-06-03-distributing-nextjs-via-npm.md description: 'Or, how to make your NPM packages 300mb smaller with this one strange trick!' --- ![Next.js and NPM](/images/posts/2021-06-03-distributing-nextjs-via-npm/210603-npm-nextjs.png) Grouparoo uses [Next.js](https://nextjs.org/) to build our web frontend(s), and we distribute these frontend User Interfaces (UIs) via NPM as packages, e.g. [`@grouparoo/ui-community`](https://www.npmjs.com/package/@grouparoo/ui-community). This allows Grouparoo users to choose which UI they want to use (or none) by changing their `package.json`: **Example `package.json` for a Grouparoo project:** ```json { "author": "Your Name ", "name": "grouparoo-application", "description": "A Grouparoo Deployment", "version": "0.0.1", "dependencies": { "@grouparoo/core": "0.3.3", "@grouparoo/postgres": "0.3.3", "@grouparoo/mailchimp": "0.3.3", "@grouparoo/ui-community": "0.3.3" // <-- Choose UI Package to install }, "scripts": { "start": "cd node_modules/@grouparoo/core && ./bin/start" }, "grouparoo": { "plugins": [ "@grouparoo/postgres", "@grouparoo/mailchimp", "@grouparoo/ui-community" // <-- Choose UI Package to load ] } } ``` Here is how we bundle up our Next.js applications so that our customers can use them out of the box. ## `next build` and `npm run prepare` The first step in “compiling” your Next.js projects is to use the [`next build`](https://nextjs.org/docs/deployment) command. We alias this to the [“prepare” npm lifecyle command](https://docs.npmjs.com/cli/v7/using-npm/scripts#prepare-and-prepublish) so that this command will be run automatically before `npm publish`. In this way we can ensure that we always have a freshly built bundle to use when we publish our packages. This is different from Next’s recommendation to alias `next build` to `npm build` because we are not “deploying” our sites - we are publishing them. Many hosting providers look for a `build` script in your `pacakge.json` to run when the deploy, hence Next.js’ recommendation. ## `.npmignore` vs `.gitignore` The next step in bundling up a Next.js application for deployment via NPM is to include the build files. In all Next.js projects, you want to ignore the `.next` folder in your `.gitignore`. The `.next` folder is where Next.js keeps all the build artifacts it creates — minified javascript, css chunks, etc. Assuming your “source code” is Typescript and SCSS, everything in the `.next` folder should be ignored, and rebuilt as needed from the source. BUT… the content of `.next` is actually what the visitors to your site really load - that’s the HTML, CSS, and Javascript that ends up in the browser. Since we are trying to package up a usable site, we need to bundle the contents of `.next` into our NPM bundles. However, we still want to exclude these rapidly changing files from `git`’s history. The solution is a `.npmignore` file! 
By default, [NPM will use a `.gitignore`](https://zellwk.com/blog/ignoring-files-from-npm-package/) file to determine which files it packs up into your packages, and which files it ignores. But, you can override this behavior by placing a `.npmignore` in your project. For example: **.gitignore** ``` .DS_Store node_modules .next ``` **.npmignore** ``` .DS_Store node_modules # .next is included ``` ## Skip the `.pack` files The final thing we learned is that while the contents of the `.next` directory are needed for your visitors, not *everything* is needed. We saw that we were shipping [300mb packages to NPM](https://www.npmjs.com/package/@grouparoo/ui-community/v/0.3.1) for our Next.js UIs. We dug into the `.next` folder and learned that if you [opt-into Webpack v5 for your Next.js site](https://nextjs.org/docs/messages/webpack5), large `.next/cache/*.pack` files will be created to speed up how Webpack works. This is normal behavior, but we were inadvertently publishing these large files to NPM! We [added](https://github.com/grouparoo/grouparoo/pull/1807) the `.next/cache/*` directory to our `.npmignore` and our build sizes went down to a more reasonable [20mb](https://www.npmjs.com/package/@grouparoo/ui-community/v/0.3.3). Our final `.npmignore` looks like this: **.npmignore** ``` .DS_Store node_modules .next/cache/* ``` --- --- url: >- /blog/post/2024-10-31-choose-a-database-with-hybrid-vector-search-for-your-ai-applications.md description: RAG needs a database --- ![a robot holding a database](/images/posts/2024-10-31-choose-a-database-with-hybrid-vector-search-for-your-ai-applications/image-1.png) At Airbyte, we see more and more folks building data pipelines to move and prepare data for AI use cases. As to not be too buzzwordy, I’ll define “AI use-cases” for this article as a “RAG” (Retrieval Augmented Generation) application to provide documents to a ‘chat’-like application. In our [previous blog post](https://airbyte.com/blog/not-impressed-with-your-ai-experience-its-not-the-model-its-the-data), we went deep into how these types of applications work, but as a refresher, the goal is to “augment” the question you will be posting to an LLM (like ChatGPT) with additional content you “retrieved”, e.g: ```text You are a helpful in-store assistant. Your #1 goal in life is to help our employees and customers find what they need. Some helpful context is: {{ relevant_documents }} Your task is to answer the question: {{ prompt }} Please don’t make things up. Only return answers based on the context provided in this prompt. ``` As an example, let’s imagine you work for a large grocery store chain, and you want to build a chat bot for your website to help folks find products in the store. If the experiment goes well, you might even install a kiosk in the store that folks can talk to verbally, and maybe even add wayfinding to get them to the right aisle and shelf… but first things first: the website chatbot. A customer might ask “What kind of cheese do you have?” (the original prompt) and the {{relevant\_documents}} should be loaded from the store’s product database for products which are “cheese-adjacent” and are in stock to provide a context-aware answer for the customer. The customer will likely continue the conversation to learn about the various types of cheese, how much each type costs, etc, and eventually select a cheese to buy for their dinner party. So… how do we get that to work? You need a database which can do Hybrid Vector Search! 
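Before getting into databases, here is the retrieval flow we are trying to build, sketched as a small function. Everything in it is a hypothetical stand-in: `embed` is your embedding model, `search` is the hybrid query we'll look at next, and `complete` is your LLM client.

```js
// a hypothetical sketch of the retrieval-augmented flow described above
async function answerCustomer(question, userLocation, { embed, search, complete }) {
  // 1) turn the question into a vector with the same model used to embed the documents
  const queryVector = await embed(question);

  // 2) hybrid search: vector relevance plus traditional filters (distance, stock)
  const relevantDocuments = await search({
    vector: queryVector,
    near: userLocation,
    maxDistanceKm: 10,
    limit: 100,
  });

  // 3) "augment" the prompt with the retrieved context and ask the LLM
  const prompt = [
    "You are a helpful in-store assistant.",
    `Some helpful context is: ${JSON.stringify(relevantDocuments)}`,
    `Your task is to answer the question: ${question}`,
    "Only return answers based on the context provided in this prompt.",
  ].join("\n");

  return complete(prompt);
}
```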
In the dark days of late 2022, there were very few options for a database which could do vector search at all. You needed to choose a specialized tool like [Pinecone](https://www.pinecone.io/) or [Milvus](https://milvus.io/) which had a custom index type which could operate on vectorized content. These databases were great at one thing - loading up the most relevant content for a given query - but they lacked all the other context around that document. As [predicted](https://news.ycombinator.com/item?id=37747534), just about every database that can is gaining vector search, as database manufacturers race to be relevant in the "AI era". This is great news for application developers, as there are now many more options available to you!

![a map of databases that work for vector search](/images/posts/2024-10-31-choose-a-database-with-hybrid-vector-search-for-your-ai-applications/image-2.png)

> *image from - this image is a year old now, there are at least 2x as many today*

For production applications, I posit that you will need a database which can do both vector search and traditional search - also known as **hybrid search**. Consider our shopping example above. Yes, we’ll want to search our product descriptions for the most relevant items for "cheese", but we likely also want to rank up those items which are in-stock and are available at the closest stores to the user. Assuming that "vector search" means `cosine_similarity` ([learn more from this great DuckDB article](https://motherduck.com/blog/search-using-duckdb-part-1/)), the query you’ll be executing might look something like this:

```sql
-- part 1: rank in-stock products highest
SELECT
  document,
  '{"in_stock":true}'::JSON as metadata,
  ARRAY_COSINE_SIMILARITY(document_embedded, {query_vector_array}) as score,
  price
FROM vector_product_descriptions
JOIN product_inventory
  ON product_inventory.product_id = vector_product_descriptions.product_id
WHERE product_inventory.location < {10_km_from_user}
GROUP BY document, metadata, score, price
HAVING SUM(product_inventory.stock_count) > 0
ORDER BY score DESC
LIMIT 100

UNION

-- part 2: include out-of-stock products lower, but still provide them to the LLM context
-- (note how the score is reduced by one)
SELECT
  document,
  '{"in_stock":false}'::JSON as metadata,
  ARRAY_COSINE_SIMILARITY(document_embedded, {query_vector_array}) - 1 as score,
  price
FROM vector_product_descriptions
JOIN product_inventory
  ON product_inventory.product_id = vector_product_descriptions.product_id
WHERE product_inventory.location < {10_km_from_user}
GROUP BY document, metadata, score, price
HAVING SUM(product_inventory.stock_count) = 0
ORDER BY score DESC
LIMIT 100
```

This query will have the effect of including (up to) the 200 most relevant documents about cheese to the LLM context window, and down-ranking those products that are out-of-stock in nearby stores, but still including them. To build the user experience we want, we needed a blend of both traditional `WHERE` clauses and vector search.

As an aside, you might be wondering what `query_vector_array` is in the example above. That’s the array representation of the user’s question, as returned to us by the same transformer that we used to calculate the embeddings of our documents. Check out https://platform.openai.com/docs/guides/embeddings or https://sbert.net to learn more.

The need for hybrid search is based on a few design principles:

1. The context and experience for any given user or question will not be the same
2. Documents are static, but the world around them is not
3.
LLM context windows are limited, and perform better with smaller context **The context and experience for any given user or question will not be the same.** In this example, every user’s context differed based on what was in stock at the nearest stores to them. Perhaps my local store has brie, but yours is sold out. If the goal is to help customers with a purchasing experience, then your answers should have a lot less brie in them. This is a simple example, but this is an important concept. Perhaps rather than an entirely customer-facing AI application, we are building an HR chatbot to help our employees learn about their healthcare options. Should everyone at the company see the same documents? Are the health plan options in the USA the same as Canada? Building secure and safe AI applications looks a lot like building any other type of software - roles and permissions matter for what folks can see and do, and those are best represented in relational databases. **Documents are static, but the world around them is not.** In our shopping example the document describing our products (e.g. the description and price of our cheeses) never changed. We paid the chunking and embedding cost to build our collection once, and that’s great. However, the other business information around that product (price, stock levels, etc) are always changing. We don’t want to pay the tax of re-embedding the document for the LLM just because someone purchased our last gouda… A traditional JOIN to the inventory table is a much better approach. **LLM context windows are limited, and perform better with smaller context.** LLM context window sizes (e.g. how much text you can give them to analyze) are growing rapidly… but that doesn’t mean you should use it. LLMs are great at sifting through text and pulling out the most relevant items… but they do hallucinate and they do mess up at times. The likelihood of hallucination increases as the context grows. Precision matters, not quantity of the context: > The lack of precise context forces the model to rely more heavily on probabilistic guesses, increasing the chance of hallucinations. Without clear guidance, the model might pull together unrelated pieces of information, creating a plausible response that is factually incorrect. > > So, your goal as an AI application developer is to reduce the context you provide to only the documents most likely to be relevant. In our cheese shopping example, maybe that means 200 documents is too many, and you’ll want to set a lower bound on what relevance scores are included in the document set. Choosing your inclusion threshold is a very application-specific and content-specific tuning activity that is more of an art then a science today… but I’m pretty sure that a `WHERE` clause will help. --- --- url: /blog/post/2017-08-14-cloudflare-and-ssr-react.md description: >- I spent a few hours today sorting out why a new React + Next.JS site was working great locally, but when deployed to production I was… --- I spent a few hours today sorting out why a new [React](https://facebook.github.io/react/) + [Next.JS](https://github.com/zeit/next.js/) site was working great locally, but when deployed to production I was seeing this new type of scary fatal React error: ![](/images/medium-export/1__a__JQaj2nymx62m1RjVuGFA.png) I won’t bore you with all the red-herring rabbit-holes I went down, but eventually I found myself comparing 2 versions of the site deployed as 2 separate Heroku apps. 
Because I was lazy, only one of them was using the CloudFlare CDN, as the real production site would be… and Bingo! The non-CDN’d site was working just fine, but the one served behind CloudFlare was having trouble. So what was going on? For those of you with great eyesight, I invite you to compare the source code of both versions of the site:

![](/images/medium-export/1__PNvvfSoSUoOlbTdGHBONFA.png)

![](/images/medium-export/1__j__gR6H2NwUq__sw__MliPDww.png)

Can’t spot the difference? I’ll zoom and enhance:

![](/images/medium-export/1__LBHJ7j__tobd2q9XhcPNDiQ.png)

[It turns out](https://stackoverflow.com/questions/38923805/what-is-react-empty) that components whose render method returns `null` are drawn as commented-out placeholders like the above. This way, if the state of that component changes and has something to render, we know where in the document it belongs. If that comment is suddenly missing… 💥! If you have a vanilla render-only-in-the-browser app, you probably would never find yourself looking at the source of your initial page, as it will be very empty. However, Next.JS provides both server-side and client side rendering for each page. This means that your initial request to the server will return a `hydrated` version of the page. This is better for SEO, page speed, etc. The trouble happens when you pipe your site through CloudFlare, and have their auto-minify options on:

![](/images/medium-export/1__lHdd3IlVkdmyQgq1zTT07A.png)

These options are normally great… unless the comments in your HTML have semantic value! Which they do here. The solution is simple: ***just turn off auto-minify***. You can probably trust your webpack/babel pipeline to produce efficient HTML, so this extra check is likely not necessary. But… this raises a question: Is it *OK* that React uses comments in this way? Should, perhaps, React be using DIVs which are hidden and empty instead? To me, commented code should always be safe to remove… and it looks like CloudFlare thinks so too. --- --- url: /blog/post/2020-05-07-grouparoo-monorepo-deployment.md description: >- Grouparoo leverages the Node.js and NPM ecosystems to manage distribution to our customers. Our open-source software is distributed via the public NPM repository ---

## A guide to the Grouparoo Monorepo Automated Release Process

![grouparoo monorepo deployment workflow](/images/posts/2020-05-07-grouparoo-monorepo-deployment/grouparoo-release-process.png)

Coming from more traditional web & app development, I’m a big fan of a [git-flow](https://www.atlassian.com/git/tutorials/comparing-workflows/gitflow-workflow)-style workflow. Specifically, the following features:

* There are `feature` branches, an integration branch where features are merged together (usually called `main`), and finally the "live" branch that customers are using (often called `stable`, `release`, or `production`)
* The `main` branch is always deployable (and should be deployed automatically with a CI/CD tool)
* A robust test suite is run against every branch and pull request before deployment

Setting up processes and tools to automate and enforce this workflow is possible with tools like [CircleCI](https://circleci.com/), [Github Actions](https://github.com/features/actions), and even [Fastlane](https://fastlane.tools/) + [CodePush](https://microsoft.github.io/code-push/) for mobile apps. However, since Grouparoo is building software that our customers run themselves, what does "pushing to production" really mean? What do automated releases look like?
This blog post outlines our processes and the tools we use to automate our deployments and builds. Our 4 major steps are:

1. CI every push
2. Staging Servers
3. NPM Pre-releases
4. NPM Releases

## How do Customers get the Grouparoo Application?

Grouparoo leverages the Node.js and NPM ecosystems to manage distribution to our customers. Our open-source software is distributed via the public [NPM repository](https://www.npmjs.com/package/@grouparoo/core), and our paid plugins via NPM Enterprise. This means that all our customers need to do in order to obtain Grouparoo is create a `package.json` and keep it up to date ([more detail here](https://www.grouparoo.com/docs/getting-started)).

```json
{
  "author": "Grouparoo Inc ",
  "name": "my-grouparoo-project",
  "description": "A Grouparoo Deployment",
  "version": "0.1.0",
  "license": "UNLICENSED",
  "private": true,
  "dependencies": {
    "@grouparoo/core": "latest",
    "@grouparoo/mysql": "latest",
    "@grouparoo/postgres": "latest",
    "@grouparoo/mailchimp": "latest",
    "@grouparoo/csv": "latest"
  },
  "scripts": {
    "prepare": "cd node_modules/@grouparoo/core && npm run prepare",
    "start": "cd node_modules/@grouparoo/core && ./bin/start",
    "dev": "cd node_modules/@grouparoo/core && ./bin/dev"
  },
  "grouparoo": {
    "plugins": [
      "@grouparoo/mysql",
      "@grouparoo/postgres",
      "@grouparoo/mailchimp",
      "@grouparoo/csv"
    ]
  }
}
```

This `package.json` will have its versions locked in place with npm (or yarn), but can be easily updated via `npm update`, as the newest version of each package requested is `latest` rather than a specific version.

## Continuous Testing for every push

![Continuous Testing all the time](/images/posts/2020-05-07-grouparoo-monorepo-deployment/ci-all-the-time.png)

The backbone of any good automated workflow is a robust test suite. You need to be sure that your new code works the way you expect, and hasn’t broken anything. We run our tests on CircleCI, and make use of Jest and many other tools. I’ll talk about our test suite in more detail in a later post, but we have a test suite for every package we publish. The [Grouparoo Monorepo](https://github.com/grouparoo/grouparoo) is a collection of many inter-related packages which we manage together via [Lerna](https://github.com/grouparoo/grouparoo). Lerna helps you keep all of your versions & packages in sync, and, more importantly, lets them rely on each other while you develop them! A change in one package might affect the rest, so we test them all in concert. Since Grouparoo is an Open Source project, you can check on the test suite of our `main` branch here: [CircleCI](https://app.circleci.com/pipelines/github/grouparoo/grouparoo?branch=main) At the moment we are: [![Build Status](https://circleci.com/gh/grouparoo/grouparoo.svg?style=svg)](https://circleci.com/gh/grouparoo/www.grouparoo.com)

## Staging Servers

Once a `feature` branch has been merged into the `main` branch, we want to immediately deploy it onto a staging server so we can do acceptance testing and share it with our partners. At this step, we use [Heroku’s Github Integration](https://devcenter.heroku.com/articles/github-integration) to deploy our `main` branch on any change, after the tests all pass of course. We use Lerna here to build every project within the monorepo, but running the project within the monorepo has some caveats. Specifically, since Lerna will use symlinks to relate projects within the monorepo to each other, the paths the project sees are not the same as when it will be installed via a normal `npm install`.
The app we run on staging looks a lot like our client example above, except that we sprinkle the environment variable `GROUPAROO_MONOREPO_APP` around ([example here](https://github.com/grouparoo/grouparoo/blob/main/apps/staging-community/package.json)). `@grouparoo/core` uses `GROUPAROO_MONOREPO_APP` to change its require paths for its peer dependencies, mainly the other Grouparoo plugins. Rather than `project/node_modules/@grouparoo/core` and `project/node_modules/@grouparoo/plugin`, the runtime within a Lerna project is more like `root/core` and `root/packages/@grouparoo/plugin`. We’ve isolated the majority of plugin loading to [this module](https://github.com/grouparoo/grouparoo/blob/main/core/api/src/utils/pluginDetails.js). In this way, we can closely emulate the experience of installing Grouparoo and related plugins locally without needing to publish every version to NPM. We use a similar paradigm when developing locally.

## NPM Prereleases

Once we’ve got our new features deployed on our staging servers, we want to release our NPM packages in a way that our customers can try out. For us, this means a release of our packages every Friday. We once again use Circle CI to run our test suite on a schedule:

```yaml
# Run the tests each week + publish
test-grouparoo-nightly:
  triggers:
    - schedule:
        cron: "0 0 * * 5"
        filters:
          branches:
            only:
              - main
```

This mode of running our CI suite includes an extra job called “publish”. Assuming again that our tests all pass, the publish command does a few things which you can [see here](https://github.com/grouparoo/grouparoo/blob/main/bin/publish).

1. Use lerna to bump the version of all packages with an “alpha” pre-release identifier; i.e. `lerna version prerelease --preid alpha` would yield a version like `v0.1.2-alpha.4`. We create a new git tag for the release and push that to Github
2. Use the [`lerna-changelog`](https://github.com/lerna/lerna-changelog) package to automatically create our release notes from our merged pull requests & push those to Github along with our new git tag
3. Push the new packages to the NPM repository, using the `next` tag.

![npm prerelease](/images/posts/2020-05-07-grouparoo-monorepo-deployment/npm-prerelease.png)

There are a number of CI secrets we need in order to manage access to NPM and Github, but they can all be stored in CircleCI’s secrets management tool. Of note, there is at this time no way to automate (or skip) a 2FA token for publishing to NPM. To overcome this, we’ve created a user who can only publish from CI which doesn’t use 2FA.

### A note on NPM Tags

Now, our customers can opt into our alpha releases by changing their dependencies from `latest` to `next` in their `package.json` file. When a normal package is published to NPM, it automatically has the `latest` tag, and that’s what will be installed with a normal `npm install @grouparoo/core`. However, you can publish your packages to any other tag you want, to create parallel distribution channels.

![npm tags](/images/posts/2020-05-07-grouparoo-monorepo-deployment/npm-tags.png)

## NPM Releases

The last stage of our release process is to publish the `latest` (read: normal channel) NPM packages. We do this by having a human make the call that we are ready, merging the release candidate (from `main` or another branch) into the `stable` branch. This will then run the same `publish` CI command as with our prerelease, but with a few changes:

1. Use lerna to bump the version of all packages and issue a patch-level semver change: `lerna version patch` would take our last pre-release version like `v0.1.2-alpha.4` and create `v0.1.3`. We create a new git tag for the release and push that to Github
2. Push the new packages to the NPM repository, using the `latest` (normal) tag.
3. Merge these new version changes back into our `main` branch so we are ready for the next round of `alpha` prereleases to start.

***

Those are the steps we use to continuously deliver Grouparoo to our customers. We use NPM release tags to regularly publish an `alpha` tagged pre-release every week, and have a human review process for our `latest` stable releases.

> The latest version of Grouparoo is just an `npm install` away!

--- --- url: /blog/post/2019-01-08-controlling-your-magic-painting-with-your-words.md description: >- Or, how to control your Meural Canvas with the API and connect it to all the things! ---

> Or, how to control your [Meural](https://www.meural.com/) Canvas with the API and connect it to all the things!

For years, I have been threatening to build a "Hogwarts Style" live painting for my apartment. I’ve been following the price of [e-ink full color displays](https://www.eink.com/color-technology.html) for years, and we were getting close to the point where spending a few thousand dollars was almost seeming reasonable... Then, for Christmas, my wife got me a [Meural](https://www.meural.com/). While not e-ink, it uses a nifty collection of ambient light sensors and matte screen coatings to produce a realistic painting effect… and it was under $1,000!

![](/images/medium-export/1__yLVPs76k5HvU5iULe45maw.jpeg)

![](/images/medium-export/1__w17CZTJDosjEqxLxIjefuw.jpeg)

While they don’t advertise these features boldly on their website, Meural did a number of things that make their products more compelling than the average IoT device:

* **SD Card reader**: Yes, you don’t need to rely on the cloud!
* **HDMI Video input**: sure, I guess it’s a TV too!
* **Gesture control:** You can wave at it to change images, control settings, and more.
* A really well thought-out **API**

Yes… there’s an API! I was given Alpha access to the Meural API, and I have to say that it is really pleasant to work with. Also, unlike most IoT companies, they actually thought about security! It’s HTTPS the whole time and every API call needs an authentication token. I spent a few hours creating a node project that lets you send photos from Twitter (by listening to tweets for a specific #hashtag) to your canvas. The example project can be found here: . Note how simple the [Meural class](https://github.com/evantahler/tweet2meural/blob/master/lib/meural.js) is to use. The Meural API (which is **very** subject to change at this point) can be found here: . You will need an account and a Meural, of course. --- --- url: /blog/post/2024-04-17-cost-conscious-elt-strategies.md description: Saving money while moving data --- ![a big record](/images/posts/2024-04-17-cost-conscious-elt-strategies/image-1.png) Airbyte is already designed to be the most effective and reliable ELT platform, but if you find yourself with a cost-conscious mindset, there is even more you can do to optimize your warehouse spend by adjusting your sync strategy and adding a few more tools and strategies into the mix. Let’s discuss some basics about how Airbyte moves your data.
On our blog, Airbyte Engineer Edward Gao [writes](https://airbyte.com/blog/how-airbyte-builds-resilient-syncs): > Airbyte has always chosen to do that data processing inside your warehouse, rather than within our own systems. This has a number of advantages. For one, it's faster than doing that massive computation within Airbyte's systems. This lets us focus on what we do best - moving data quickly and reliably - and leverages your warehouse for what it does best. And furthermore, doing [typing and deduping](https://docs.airbyte.com/using-airbyte/core-concepts/typing-deduping) in your warehouse is more private and secure by design - Airbyte never persists your data within our infrastructure, and we only hold onto your data, exclusively in-memory, for the shortest time possible. The largest cost component of a sync is often **deduplication**, which is an optional Airbyte feature supported by many of our database and data warehouse destinations. This is because large amounts of compute and memory are needed to compare all the records in your data warehouse with each other - and the cost scales with the volume of data loaded. Deduplication is especially costly because it requires checking every row in the database, not just the new records, or those records in the active partition/shard/cluster. If you want deduplication, there’s no way of avoiding this cost: either the deduplication activity happens in your sync tool, or the destination which is holding that data. How can you tell if deduplication costs are contributing to your warehouse bill? If you move a consistent amount of data each sync, and your sync time (and Airbyte bill) remain constant, but your warehouse costs are increasing - that’s a strong signal. That shows that the processing time needed is not related to the data moved, but to the total data loaded. So, what can we do to minimize the impact of deduplication on your costs? Here are 4 strategies you can use alongside Airbyte’s append-only sync mode to maximize the cost efficiency of your deduplication efforts. ![a big record](/images/posts/2024-04-17-cost-conscious-elt-strategies/image-2.png) ## Strategy 1: Deduplicate Later Yes, it sounds trite, but do you *really* need [deduplication](https://docs.airbyte.com/using-airbyte/core-concepts/sync-modes/incremental-append-deduped) *as* the data is loaded? Without deduplication, and an [append-only](https://docs.airbyte.com/using-airbyte/core-concepts/sync-modes/incremental-append) sync, your data warehouse will contain multiple entries for the same primary key, as it goes through changes: ![a big record](/images/posts/2024-04-17-cost-conscious-elt-strategies/image-3.png) In this example users table, I signed up on April 16, and changed my name on April 17 (user ID #1). Airbyte always includes metadata about when data was extracted at the start of the ELT pipeline so you can determine which entry is the latest for any given primary key, even if your data doesn’t include a logical cursor natively (e.g. an `updated_at` column in the database, or a CDC cursor). Can your analysis or downstream application make use of these additional columns to pluck the latest entry? Especially if you are using the synced airbyte data to then produce more ergonomic business tables, you can offload deduplication to your transformation step. This helps control costs by decoupling the act of deduplication from syncing - you can sync at a rapid pace, and only deduplicate when it is needed. 
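For example, a downstream query or transformation can pluck the latest entry per primary key at read time. Here is a minimal sketch in Snowflake syntax, assuming the `USERS` table and Airbyte metadata columns from the example above (adapt the identifiers to your own schema):

```sql
-- Keep only the most recent version of each user, ranked by when Airbyte extracted it
SELECT *
FROM USERS
QUALIFY ROW_NUMBER() OVER (
  PARTITION BY ID
  ORDER BY _AIRBYTE_EXTRACTED_AT DESC
) = 1;
```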
## Strategy 2: Deduplicate On Demand Perhaps you do want to deduplicate your data for many downstream consumers, but those consumers only read the table periodically. This is a great use case for a (standard, non-materialized) view that handles deduplication. You can set a rapid replication frequency with your append-only Airbyte sync, but only incur the deduplication cost at read time via the view. For example, this is how you might create a view containing the latest entries for the user table above, with Snowflake syntax: ```sql CREATE VIEW USERS_DEDUPED AS ( SELECT * FROM USERS WHERE _AIRBYTE_RAW_ID IN ( SELECT _AIRBYTE_RAW_ID FROM ( SELECT _AIRBYTE_RAW_ID, row_number() OVER ( PARTITION BY ID ORDER BY _AIRBYTE_EXTRACTED_AT DESC ) as ROW_NUMBER FROM USERS ) WHERE ROW_NUMBER = 1 ) ); ``` ## Strategy 3: Deduplicate Less Often Going up in complexity from the previous strategies, perhaps you have many consumers of the data, and still want deduplication… but lag is OK. You can use a materialized view with a materialization frequency (e.g. once-a-day). This pattern allows you to keep a rapid ingestion frequency while locking your deduplication cost down to a fixed, single invocation frequency. This decouples the “EL” cost from the “T”, allowing you to tweak the schedules of each independently. And, because you’ll have the loaded data in a non-deduplicated form, you can always do ad-hoc analysis prior to the view re-computing. Many data warehouses allow for scheduled materialization, or you can use a tool like dbt Cloud to run your transformations on a ‘real’ table at a set cadence. ## Strategy 4: Partial Deduplication This final strategy is a mix of any of the previous strategies, with an additional twist - bounding boxes! Perhaps you’ve loaded all of your historic data from a large source, but you know that your use case only requires the data for the past year. In that case, you can filter the data that your database needs to consider for deduplication with an added where clause. For example, if you only want to look at active users this year, you can safely ignore any row that was `updated_at` before January 1st, which means it would also be safe to include `_airbyte_extracted_at` in the filter as well. When possible in the data warehouse, Airbyte automatically sets `_airbyte_extracted_at` as the partition or cluster key, making queries like this efficient. Use these filters when creating your views or business tables, and your database will be able to skip many rows in the deduplication calculation. ## Conclusion For most Airbyte users, we recommend that you let Airbyte deduplicate your data for you. This will provide the best analysis experience in your data warehouse - your data will be as up-to-date as possible and ready for further analysis, without needing any other tools or orchestration. Airbyte (when deduplication is selected) will perform the loading of the final table, typecasting, and deduplication in a single transaction, periodically throughout the sync. This guarantees that at no point will there be multiple entries for the same row in your final table - preventing stale reads. You also won’t need to manage additional tools, schedules, or SQL to deduplicate your data. That said, for certain cost-sensitive use cases or particularly large volumes of data, you may want to control this deduplication process yourself, or opt-out entirely. 
The strategies listed here, tools in the modern data stack, and the features data warehouse vendors have built make this possible without too much of a headache. Go forth and sync that data - without breaking the bank! --- --- url: /blog/post/2026-03-13-curl-for-mcp.md description: >- We're building increasingly complex integrations to connect coding agents to MCP servers. But these agents already know how to use the CLI. So why are we teaching them a new interface? --- ![curl for MCP: Why Coding Agents Are Happier Using the CLI](/images/posts/2026-03-13-curl-for-mcp/image1.png) I've been thinking a lot about how coding agents interact with external services. At Arcade, we build agentic tools, so I spend most of my days watching AI agents try to do real things in the real world. And one pattern keeps bugging me. We're building increasingly complex integrations to connect coding agents to MCP servers. Custom SDKs, persistent connections, elaborate client configurations. But here's the thing: these agents already know how to use the CLI. They've been trained on millions of shell sessions. curl, git, docker, npm. They know the patterns. Flags, stdin, stdout, pipes, exit codes: it's all in there. So why are we teaching them a new interface? ## Why Not Just Use APIs? I know what some of you are thinking. "Why not skip MCP entirely and just give the agent raw API access?" I've watched agents try this. It breaks in predictable ways. APIs are designed for developers who read documentation, understand authentication flows, and know which endpoints to call in what order. An agent staring at a raw REST API has to figure out: which of these 200 endpoints do I actually need? What's the auth scheme? What are the required headers? How do I paginate? What does error code 422 mean in this specific API's context? That's a lot of inference work before a single useful action happens, and every bit of it burns tokens and introduces failure modes. MCP solves this at the protocol level. It gives you a standard way to advertise capabilities, describe schemas, handle authentication, and manage the lifecycle of tool calls. An MCP tool doesn't expose "here are 200 endpoints, good luck." It exposes "here are the 12 things you can do, here's exactly what each one needs, and here's how to authenticate." The agent spends its tokens on the task, not on figuring out the plumbing. Nobody wrote blog posts declaring "REST is dead, just use curl." curl is how you talk to REST from a terminal. mcpx is how you talk to MCP from a terminal. Same relationship. The protocol still matters. The interface is what changed. ## The Best of Both Worlds What if the agent's interface to MCP was just… the CLI? That's the idea behind mcpx. It's a command-line tool that speaks MCP under the hood but presents a shell interface on top. Think curl: you don't need to understand HTTP/2 or TLS handshakes to make an API call. You just type a command and get a response. The workflow for an agent looks like this: ```shell # 1. Search for relevant tools across all configured MCP servers mcpx search "create issue" # 2. Inspect a specific tool's schema mcpx info linear create_issue # 3. Execute it mcpx exec linear create_issue '{"title": "Fix the login bug", "priority": "high"}' ``` No persistent connections to manage. No tool schemas bloating the system prompt. The agent discovers what it needs on demand, validates inputs locally before sending, and gets structured output it can parse. 
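Because every call is just a process writing to stdout, normal shell composition applies. As a hypothetical follow-on to the workflow above (the `url` field is an assumption about the shape of the tool's JSON response, not a documented contract), an agent could chain mcpx with `jq`:

```shell
# Create the issue and pull the new issue's URL out of the JSON response
# (the .url field is assumed here for illustration)
mcpx exec linear create_issue '{"title": "Fix the login bug", "priority": "high"}' | jq -r '.url'
```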
## When CLI, When Remote I want to be honest about where mcpx fits and where it doesn't. This matters, and I don't think the current discourse is being precise enough about it. **CLI (mcpx) is built for single-user, single-machine coding agents.** You're a developer. You're running Claude Code, Cursor, Windsurf, or Cline. You want your coding agent to interact with GitHub, Linear, Slack, and your database without configuring a custom MCP client for every single tool. Half the MCP clients out there haven't built robust integrations yet, or they're still catching up on auth flows, or their MCP support is "technically works" but not production-ready. mcpx sidesteps all of that. One CLI, one install, every MCP server your agent needs. That's the sweet spot: a developer, their agent, their machine. **Remote MCP (HTTP) is built for multi-user agentic applications.** If you're building something with LangChain, CrewAI, or any framework where multiple users are triggering agents that act on their behalf, you need the full remote MCP flow. Multi-user isolation, per-user OAuth delegation, tenant-scoped permissions, centralized audit trails. The CLI isn't the right interface for that. A proper HTTP-based MCP connection through a gateway is. These aren't competing approaches. They're different interfaces to the same infrastructure: ![CLI vs Remote MCP architecture](/images/posts/2026-03-13-curl-for-mcp/image1.png) Same gateway. Same tools. Same auth. Same audit trail. The only thing that changes is how you connect. mcpx is the left column. If you need the right column, you need a remote MCP, and that's the right call. I'm building mcpx because the left column didn't have good tooling. Not because the right column doesn't matter. ## Why This Matters for Token Efficiency There's a practical reason this approach works well for coding agents specifically: tokens are expensive, and context windows are finite. The typical MCP integration loads every available tool's schema into the agent's system prompt. If you've got 50 tools across 5 servers, that's a lot of context window spent on schemas the agent might never use. With mcpx, the agent starts with zero tools in context and progressively discovers what it needs. Search first, inspect second, execute third. You're only paying for what you actually use. And because each call is ephemeral (spawn the process, get the result, done), there's no connection state to manage between turns. The agent's context stays clean. ## Good Tooling Matters If we're going to ask agents to use the CLI for MCP, the tooling needs to be good. Not "technically works" good, actually good. The way curl is good for HTTP, or jq is good for JSON. That means: **Smart output** - human-readable tables in a terminal, JSON when piped to another tool. Auto-detected, no flags needed. **Real debugging** - `mcpx -v` shows you HTTP headers, JSON-RPC messages, and round-trip timing. When something breaks, you can see exactly what happened. Here's what that looks like when running against a production gateway like Arcade's: ``` mcpx -v exec arcade Gmail_WhoAmI > POST https://api.arcade.dev/mcp/evan-coding > authorization: Bearer eyJhbGci... > content-type: application/json < 200 OK (142ms) < x-request-id: abc123 ``` That authorization header isn't a shared API key sitting in a `.env` file. It's a scoped OAuth token - the Gateway handled the auth flow, enforced permissions, and logged the call. I just typed a command. 
**Search that works** - keyword matching is fine for when you know what you're looking for. But agents often don't. mcpx includes semantic search using a local embedding model (no API key, no network calls) so agents can find tools by describing what they want to accomplish. **Full protocol support** - OAuth for remote servers, async tasks for long-running operations, server-requested input (elicitation), structured logging. The MCP spec is moving fast, and your CLI client needs to keep up. ## Up-to-Date Clients Matter Too This is the part that doesn't get enough attention. MCP is a young protocol, and the spec is evolving quickly. Tasks, elicitation, structured logging - these are all relatively new additions, and they matter for real-world use. mcpx tracks the latest MCP SDK and implements the full spec: stdio and HTTP transports (with automatic Streamable HTTP → SSE fallback), OAuth discovery and token refresh, JSON Schema validation, task management with cancellation, and server-requested input flows. When the spec adds something new, the CLI should support it - otherwise agents are stuck with a partial view of what MCP can do. ## Try It mcpx is open source (MIT) and available as a single binary or via npm: ```shell # Install bun install -g @evantahler/mcpx # or curl -fsSL https://raw.githubusercontent.com/evantahler/mcpx/main/install.sh | bash # Configure a server mcpx add github --url https://mcp.github.com # Start exploring mcpx search "pull request" ``` If you're using Claude Code or Cursor, mcpx ships with built-in agent skills: ```shell mcpx skill install --claude # Claude Code mcpx skill install --cursor # Cursor ``` One command, and your coding agent knows how to discover and use MCP tools on demand: no schema bloat, no persistent connections. I use mcpx against [Arcade's](https://docs.arcade.dev/en/guides/mcp-gateways) gateway daily; that's how I get access to tools across GitHub, Slack, Google Workspace, Linear, and a bunch of other services without configuring each one individually. The gateway handles OAuth and audit logging, so I don't have to think about it. ```shell mcpx add arcade --url https://api.arcade.dev/mcp/engineering-tools mcpx search "send email" ``` If you're building with MCP servers or building MCP servers, give it a shot. The iteration speed difference has been significant for me. Source: [github.com/evantahler/mcpx](https://github.com/evantahler/mcpx) *Evan Tahler is Head of Engineering at [Arcade](https://arcade.dev), the only runtime for MCP. He built mcpx because something needed to exist and it didn't.* --- --- url: /blog/post/2012-04-04-google-analytics-curl.md description: >- Whenever I am about to start integrating with a new API, I like to walk through the steps with cURL. For those of you who have been living --- ![](/images/medium-export/1__5h6hxUUaZjLmz5qpH6uPpA.png) Whenever I am about to start integrating with a new API, I like to walk through the steps with cURL. For those of you who have been living under a rock, [cURL is a great command-line tool](http://curl.haxx.se/docs/manpage.html) which can emulate almost every type of action you can make in a browser for testing (GET, POST, PUT, Cookies, Headers, etc). It makes a great prototyping tool for APIs. I was adding Google Analytics support to [nodeChecker](https://github.com/evantahler/nodeChecker) and I was having a hard time understanding their authorization APIs. It appeared that there were 2 options, both oAuth2 and a more programmatic approach with an API key. 
I never got the API key method to work unfortunately, but I was able to access my data without needing to configure/register an application with Google either. I was sad that I couldn’t find any other blogs/guides talking about cURL and the Google Analytics API… Without further delay, here is my **guide to cURL-ing your way into the Google Analytics API**. ### 1) Authenticate ```bash curl https://www.google.com/accounts/ClientLogin --data-urlencode Email= --data-urlencode Passwd= -d service=analytics ``` `--data-urlencode` takes your variables and encodes them as if they were part of a form POSTed to the url. The existence of that flag also turns your request into a POST request. This will return one response with 3 keys: ```bash SID=AAAAAA \r\n LSID=BBBBBB \r\n Auth=CCCCCC ``` The "Auth" key is the one we need. Think of this like your session cookie for Google Analytics. ### 2) Determine your "account id" Subsequent requests to Google Analytics will require you to know your account id for your tracking profile. It is not what you think it is (the UTM-\* number). To retrieve a list of all of your accounts, you can do this: ```bash curl -H "Authorization: GoogleLogin auth=" "https://www.googleapis.com/analytics/v3/management/accounts/~all/webproperties" ``` You will now have a list of your URLs and account keys. The -H flag sets a custom authorization header with the "cookie" we retrieved above. These IDs won’t change, so you should only need to look this up the first time. Alternatively, you can use [Google’s API playground](http://code.google.com/apis/explorer/#_s=analytics&_v=v3&_m=management.webproperties.list\&accountId=~all) to see all of your accounts. Yes, you have to say "~all" in the accountID field. ### 3) Get Data Getting data can be thought of as an OLAP lookup. You provide a metric (like visits) and a dimension to slice it up by (per day). You can ask for many metrics and dimensions at once, but I’m only going to show you how to do a single request. ```bash curl -H "Authorization: GoogleLogin auth=" "https://www.googleapis.com/analytics/v3/data/ga/?ids=ga:&metrics=ga:visits&dimensions=ga:date&start-date=2011-01-01&end-date=2012-02-01" ``` Now that you know the basics, you can use [the API doc](http://code.google.com/apis/analytics/docs/gdata/v3/reference.html) to understand the rest of the parameters. Enjoy! Some important Google Analytics caveats to always keep in mind: * There is a collection delay. Google promises less than a "24 hour" lag between data collection and display on the site (and I’m assuming this applies to the API as well), but I have seen it as bad as 3 days if your site is particularly popular * Sessions Expire. You will need to re-authenticate every 24 hours. Or you can be "safe" like I am, and re-authenticate with every request. Non-blocking programming FTW! Well I feel silly now… After posting this, I finally found Google’s own documentation on the topic here: If only there was some sort of way to "search" on the "internet" for relevant information… *Originally published at 04 Apr 2012* --- --- url: /blog/post/2018-06-22-customizable-other-option-in-google-forms.md description: Hi Google! --- Hi Google! Google Forms is a great product. It allows just about anyone (academics, companies, government agencies) to create an online form to do research. It’s free, and it is great! A common use case is capturing demographic data. 
When asking about gender, an inclusive best practice, according to the [Human Rights Campaign](https://www.hrc.org/resources/collecting-transgender-inclusive-gender-data-in-workplace-and-other-surveys) is to provide the following options: ```text What is your Gender Identity? * Male * Female * Non-binary / Third Gender * Prefer not to say * Prefer to self-describe: ____________ ``` Note how the last option is a free-form text field. One of the important things keep in mind when asking a question about *who you are* is not to alienate anyone. Using the word "other" makes people feel… like *others*. With that in mind, check out how the following question (a radio selection with a free-text option) is displayed in a Google Form: ![](/images/medium-export/1__3rOVI36gT6fubqDUaEflhw.png) It would be great if Google Forms allowed the customization of the label on the final, free-text, entry field. Thanks! --- --- url: /blog/post/2018-11-01-data-science-vs-analytics.md description: >- Data Science vs Analytics are two related disciplines in most startups. I’ve also heard a number of ways to describe the distinctions… --- ![](/images/medium-export/1__1LI9TzwDU1l6IyJFBRcULw.jpeg) Data Science vs Analytics are two related disciplines in most modern companies. I’ve heard a number of ways to describe the distinctions between the two, but they rarely have consensus. Here are the definitions I’ve synthesized over my career: **Analytics:** The practice of analyzing data to explain why a behavior took place. **Data Science**: The practice of using data to influence behavior. So, say you worked for a streaming video company: The **Analytics** team would explain why, all of a sudden, old re-runs of Frasier are very popular. The **Data Science** team would write the "*… you might also like*" part of the application to indicate that because you liked other 90’s sitcoms, you might also like Frasier. --- --- url: /blog/post/2021-01-21-defering-side-effects-in-node.md description: >- Using AsyncHooks, you can collect side-effects within a database transaction and only run them if the transaction succeeds --- ![A glowing laptop](/images/posts/2021-01-21-defering-side-effects-in-node/laptop.jpeg) At Grouparoo, we use [Actionhero](https://www.actionherojs.com) as our Node.js API server and [Sequelize](https://sequelize.org) for our Object Relational Mapping (ORM) tool - making it easy to work with complex records from our database. Within our Actions and Tasks, we often want to treat the whole execution as a single [database transaction](https://stackoverflow.com/questions/974596/what-is-a-database-transaction) - either all the modifications to the database will succeed or fail as a unit. This is really helpful when a single activity may create or modify many database rows. ## Why do we need Transactions? Take the following example from a prototypical blogging site. When a user is created (`POST /api/v1/user`), we also create their first post and send them a welcome email. All examples in this post are written in Typescript, but the concepts work the same for Javascript. 
```ts import { action } from "actionhero"; import { User, Post } from "../models"; export class UserCreate extends Action { constructor() { super(); this.name = "user:create"; this.description = "create a user and their first post"; this.inputs = { firstName: { required: true }, lastName: { required: true }, password: { required: true }, email: { required: true }, }; } async run({ params }) { const user = await User.create(params); await user.updatePassword(params.password); await user.sendWelcomeEmail(); const post = await Post.create({ userId: user.id, title: "My First Post", published: false, }); return { userId: user.id, postId: post.id }; } } ``` In this example, we: 1. Create the user record 2. Update the user’s password 3. Send the welcome email 4. Create the first post for the new user 5. Return the IDs of the new records created This works as long as nothing fails mid-action. What if we couldn’t update the user’s password? The new user record would still be in our database, and we would need a try/catch to clean up the data. If not, when the user tries to sign up again, they would have trouble as there would already be a record in the database for their email address. To solve this cleanup problem, you could use transactions. Using [Sequelize’s Managed Transactions](https://sequelize.org/master/manual/transactions.html), the run method of the Action could be: ```ts async run({ params }) { return sequelize.transaction(async (t) => { const user = await User.create(params, {transaction: t}); await user.updatePassword(params.password, {transaction: t} ); await user.sendWelcomeEmail(); const post = await Post.create({ userId: user.id, title: 'My First Post', published: false, }, {transaction: t}) return { userId: user.id, postId: post.id }; }) } ``` Managed Transactions in Sequelize are very helpful - you don’t need to worry about rolling back the transaction if something goes wrong! If there’s an error `throw`-n, it will rollback the whole transaction automatically. While this is safer than the first attempt, there are still some problems: 1. We have to remember to pass the `transaction` object to *every* Sequelize call 2. We need to ensure that every method we call which *could* read or write to the database needs to use the transaction as well, and take it as an argument (like `user.updatePassword()`... that probably needs to write to the database, right?) 3. Sending the welcome email is not transaction safe. Sending the email as-written will happen even if we roll back the transaction because of an error when creating the new post… which isn’t great if the user record wasn’t committed! So what do we do? ## Automatically Pass Transactions to all Queries: CLS-Hooked The solution to our problem comes from a wonderful package called [`cls-hooked`](https://github.com/Jeff-Lewis/cls-hooked). Using the magic of [`AsyncHooks`](https://github.com/nodejs/node/blob/master/doc/api/async_hooks.md), this package can tell when certain code is *within* a callback chain or promise. In this way, you can say: "for all methods invoked within this async function, I want to keep this variable in scope". This is pretty wild! If you opt into using Sequelize with CLS-Hooked, *every* SQL statement will check to see if there is already a transaction in scope... You don't need to manually supply it as an argument! 
From the `cls-hooked` readme: > CLS: "Continuation-Local Storage" > Continuation-local storage works like thread-local storage in threaded programming, but is based on chains of Node-style callbacks instead of threads. There is a performance penalty for using `cls-hooked`, but in our testing, it isn’t meaningful when compared to `await`-ing SQL results from a remote database. Using `cls-hooked`, our Action's run method now can look like this: ```ts // Elsewhere in the Project const cls = require('cls-hooked'); const namespace = cls.createNamespace('actionhero') const Sequelize = require('sequelize'); Sequelize.useCLS(namespace); new Sequelize(....); // Our Run Method async run({ params }) { return sequelize.transaction(async (t) => { const user = await User.create(params); await user.updatePassword(params.password); await user.sendWelcomeEmail(); const post = await Post.create({ userId: user.id, title: 'My First Post', published: false, }) return { userId: user.id, postId: post.id }; }) } ``` Ok! We have removed the need to pass `transaction` to all queries and sub-methods! All that remains now is the `user.sendWelcomeEmail()` side-effect. How can we delay this method until the end of the transaction? ## CLS and Deferred Execution Looking deeper into how `cls-hooked` works, we can see that it is possible to tell if you are currently in a namespace, and to set and get values from the namespace. Think of this like a session... but for the callback or promise your code is within! With this in mind, we can write our run method to be **transaction-aware**. This means that we can use a pattern that knows to run a function in-line if we aren’t within a transaction, but if we are, defer it until the end. We’ve wrapped utilities to do this within [Grouparoo’s CLS module](https://github.com/grouparoo/grouparoo/blob/main/core/src/modules/cls.ts). With the CLS module you can write code like this: ```ts // from the Grouparoo Test Suite: Within Transaction test("in a transaction, deferred jobs will be run afterwords", async () => { const results = []; const runner = async () => { await CLS.afterCommit(() => results.push("side-effect-1")); await CLS.afterCommit(() => results.push("side-effect-2")); results.push("in-line"); }; await CLS.wrap(() => runner()); expect(results).toEqual(["in-line", "side-effect-1", "side-effect-2"]); }); ``` You can see here that once you `CLS.wrap()` an `async` function, you can defer the execution of anything wrapped with `CLS.afterCommit()` until the transaction is complete. The order of the `afterCommit` side-effects is deterministic, and `in-line` happens first. You can also take the same code and choose not apply `CLS.wrap()` to it to see that it still works, but the order of the side-effects has changed: ```ts // from the Grouparoo Test Suite: Without Transaction test("without a transaction, deferred jobs will be run in-line", async () => { const results = []; const runner = async () => { await CLS.afterCommit(() => results.push("side-effect-1")); await CLS.afterCommit(() => results.push("side-effect-2")); results.push("in-line"); }; await runner(); expect(results).toEqual(["side-effect-1", "side-effect-2", "in-line"]); }); ``` ## CLSAction and CLSTask Now that it is possible to take arbitrary functions and delay their execution until the transaction is complete, we can use these techniques to make a new type of Action and Task that has this functionality built in. 
We call these [`CLSAction`](https://github.com/grouparoo/grouparoo/blob/main/core/src/classes/actions/clsAction.ts) and [`CLSTask`](https://github.com/grouparoo/grouparoo/blob/main/core/src/classes/tasks/clsTask.ts). These new classes extend the regular Actionhero Action and Task classes, but provide a new `runWithinTransaction` method to replace `run`, which helpfully already uses `CLS.wrap()`. This makes it very easy for us to opt into an Action which automatically runs within a Sequelize transaction and can defer its own side-effects! Putting everything together, our new transaction-safe Action looks like this: ```ts // *** Define the CLSAction Class *** import { Action } from "actionhero"; import { CLS } from "../modules/cls"; export abstract class CLSAction extends Action { constructor() { super(); } async run(data) { return CLS.wrap(async () => await this.runWithinTransaction(data)); } abstract runWithinTransaction(data): Promise<any>; } ``` ```ts // *** Use the CLSAction Class *** import { CLSAction } from "../classes"; import { User, Post } from "../models"; export class UserCreate extends CLSAction { constructor() { super(); this.name = "user:create"; this.description = "create a user and their first post"; this.inputs = { firstName: { required: true }, lastName: { required: true }, password: { required: true }, email: { required: true }, }; } async runWithinTransaction({ params }) { const user = await User.create(params); await user.updatePassword(params.password); await CLS.afterCommit(user.sendWelcomeEmail); const post = await Post.create({ userId: user.id, title: "My First Post", published: false, }); return { userId: user.id, postId: post.id }; } } ``` If the transaction fails, the email won’t be sent, and all the models will be rolled back. There won't be any mess to clean up 🧹! ## Summary The `cls-hooked` module is a very powerful tool. If applied globally, it unlocks the ability to produce side-effects throughout your application worry-free. Perhaps your models need to enqueue a Task every time they are created... now you can if you `cls.wrap()` it! You can be sure that task won’t be enqueued unless the model was really saved and committed. This unlocks powerful tools that you can use with confidence. --- --- url: /blog/post/2012-12-20-delete-old-git-branches.md description: Keep your git branches clean! --- ![](/images/medium-export/1__HArdB4tMciWUivKfh4XWxg.jpeg) It was time to clean up some old git branches at TaskRabbit today. It turned out that we had hundreds of branches that were "old", and could be removed. What do I mean by "old"? As with many things, coming up with the proper definition is 1/2 the battle. At the end of the day, "old" meant "**I have been merged into master, and contain no un-merged code**" (where master is your integration branch). When phrased this way, there are some systematic and simple ways to do some git pruning. 
Here’s a simple rake task: ```ruby namespace :git do desc "delete remote branches which have already been merged into master" task :clean_merged_branches do local_branches = `git branch`.split("\n").map{ |line| line.gsub(" ","") } raise "You need to be in master to start this game" unless local_branches.include?('*master') say_and_do("git fetch --prune") #clean up your local and remove deleted branches bad_branches = `git branch -r --merged`.split("\n").map{ |line| line.gsub(" ","") } bad_branches.each do |bad_branch| parts = bad_branch.split("/") remote = parts.shift if remote == "origin" branch = parts.join("/") next if branch =~ /^HEAD.*/ next if branch =~ /^refs\/.*/ next if branch == "master" next if branch =~ /.*staging.*/ next if branch =~ /.*production.*/ say_and_do("git branch -D #{branch}") if local_branches.include?(branch) say_and_do("git push origin :#{branch}") else puts "skipping #{bad_branch} because it doesn't have a remote of 'origin'" end end end end def say_and_do(stuff, explanation=nil) puts explanation if explanation puts " > #{stuff}" `#{stuff}` end ``` The trick here is the `git branch -r --merged` command, which does exactly what we want: tell me about the remote branches which have all been merged into my current branch, master. We simply collect those branches, and delete them locally (if they exist) and on origin. The logic goes like this: 1. Ensure I am in the master branch 2. `git fetch --prune` (clean up my local branch list according to remote’s list) 3. `git branch -r --merged` (show me the branches which have been merged into the integration branch) 4. loop through those branches and delete them locally and remotely Two other points of note: 1. It’s very likely that you will have some staging, test, and production branches which are either equivalent to or slightly behind your integration branch. You probably want to explicitly ignore those 2. If you have more than one remote branch setup (perhaps heroku for deployment or tddium for testing), you want to be sure to ignore any branch which isn’t from "origin" --- --- url: /blog/post/2012-04-04-deploying-node-apps-with-capistrano.md description: I really like Capistrano. --- ![](/images/medium-export/1__GOljYLFswSUI6EC7NCFkLg.jpeg) I really like [Capistrano](https://github.com/capistrano/capistrano). Capistrano is a deployment gem for Ruby which helps you manage migrations, deployments, applications, etc on all of your many servers. It excels at keeping many servers in sync and managing complex multi-application deployments. If you are a ruby-on-rails developer, you probably already use Capistrano. I have been working on node.js applications lately, and I was missing the simplicity of Capistrano deployments, so I wondered how hard it would be to use Capistrano to manage a node.js application. It turns out that it is fairly easy! This guide assumes that you already have your server set up (node installed, npm installed, database installed, etc), but you can of course manage this with Capistrano as well. An important note is that you do not need to install ruby on the servers you are deploying to, just your local development environment. I’m going to assume you also have a version of Ruby installed (ships with all modern OSX versions). Step one is installing Capistrano (gem install capistrano). Gems are like npm packages if you aren’t familiar with ruby. You may need to be root to do this (sudo gem install capistrano), but I recommend using rbenv, a great ruby version manager. 
To get rbenv up and running, here are the **tl;dr** steps (I’ll write a longer post on rbenv vs rvm in the future, but this is the "least scary" way to have custom versions of ruby installed at the user-level): * Install rbenv for OSX: * (Homebrew: ) * Install rbenv: brew install rbenv * Install ruby-build (for installing ruby versions): brew install ruby-build * Install a ruby version * Set your global rbenv version * Set up some handy bash aliases for running ruby apps: * export PATH="$HOME/.rbenv/bin:$PATH" * eval "$(rbenv init -)" * alias r="rbenv exec " * alias rb="rbenv exec bundle exec " * exec $SHELL * Run your apps with rbenv * r gem install bundler * r bundle install Now, rather than worrying about being root or messing with your system-level ruby and gems, you can just run r and then your command in ruby user space! I’m using r as shorthand for rbenv exec, as set up in my bash configuration steps above. OK, we have Capistrano installed; now we need to set up a few files in our node project. Luckily this is as simple as typing r capify . in the project directory’s root. The capify command will set up a few files and folders in your project. You don’t need to worry about Capfile (which is used by the Capistrano command later to initialize itself), but ./config/deploy.rb is where we will be building our custom deployment steps. The one tricky thing with node.js applications is that by default they expect to be run within a console. This means that it can be hard to daemonize them (running headless), which is what you want on a production web server. However, the fantastic package Forever does exactly this for us. Forever monitors and logs your application, as well as creating handy start and stop wrappers for your project. Ensure that you have forever installed (npm install forever) and that it is listed in your package.json. Forever can be installed globally (so you have handy access to the ‘forever’ command), but I’ll be using it locally to minimize the chance of conflicts. So now we have Capistrano and Forever installed. We need to create some new deployment tasks in ./config/deploy.rb to tell Capistrano exactly what we want it to do when we deploy. 
Here is my skeleton deploy.rb: ```ruby set :application, "MY_APPLICATION" set :repository, "git@github.com:PATH_TO_MY_REPO" set :scm, :git set :use_sudo, false set :keep_releases, 5 set :deploy_via, :remote_cache set :main_js, "MAIN_APP.js" desc "Setup the Demo Env" task :demo do set :branch, 'develop' set :domain, 'MY DEMO SERVER' set :user, 'MY SSH USER' set :applicationdir, "/home/#{user}/deploy/#{application}" set :deploy_to, applicationdir ssh_options[:keys] = ["/path/to/my/ssh.pub"] server 'MY DEMO SERVER', :app, :web, :db, :primary => true end desc "Setup the Production Env" task :production do set :branch, 'master' set :domain, 'MY PROD SERVER' set :user, 'MY SSH USER' set :applicationdir, "/home/#{user}/deploy/#{application}" set :deploy_to, applicationdir server 'MY PROD SERVER', :app, :web, :db, :primary => true end namespace :deploy do before 'deploy:start', 'deploy:npm_install' before 'deploy:restart', 'deploy:npm_install' # before 'deploy:default', 'deploy:setup' after 'deploy:create_symlink', 'deploy:symlink_node_folders' after 'deploy:setup', 'deploy:node_additional_setup' desc "START the servers" task :start, :roles => :app, :except => { :no_release => true } do run "cd #{applicationdir}/current/ && node_modules/.bin/forever start #{main_js}" end desc "STOP the servers" task :stop, :roles => :app, :except => { :no_release => true } do run "cd #{applicationdir}/current/ && node_modules/.bin/forever stop #{main_js}" end desc "RESTART the servers" task :restart, :roles => :app, :except => { :no_release => true } do run "cd #{applicationdir}/current/ && node_modules/.bin/forever restart #{main_js}" end task :symlink_node_folders, :roles => :app, :except => { :no_release => true } do run "ln -s #{applicationdir}/shared/node_modules #{applicationdir}/current/node_modules" end task :node_additional_setup, :roles => :app, :except => { :no_release => true } do run "mkdir -p #{applicationdir}/shared/node_modules" end task :npm_install, :roles => :app, :except => { :no_release => true } do run "cd #{applicationdir}/current/ && npm install" end task :npm_update, :roles => :app, :except => { :no_release => true } do run "cd #{applicationdir}/current/ && npm update" end end task :tail do resp = capture "cd #{applicationdir}/current/ && node_modules/.bin/forever logs | grep #{main_js}" log = resp.split(" ").last log.gsub!("\e[35m", "") log.gsub!("\e[39m", "") run "tail -f #{log}" en ``` You can see that we have custom start, stop, and restart commands which [Capistrano’s normal deployment tasks](https://github.com/capistrano/capistrano/wiki/2.x-Default-Deployment-Behaviour) will use. You can also of course call these tasks directly if you want to restart your servers. ![](/images/medium-export/0__qhxMTgdQNu__h9Icv.jpg) Other than the wrappers for Forever, I also chose to symlink my node\_modules directory to a common place. This will allow me to share my packages between deploys (which are all deployed in separate folders, but symlinked to "current") which will make subsequent deployments really fast. Now you can use Capistrano very simply to check out new code from GitHub, make a new folder to hold the new code, run any npm updates, stop the old server, and start a new one with one simple command: r cap demo deploy. You will note that we have tasks called "demo" and "production". These tasks just set variables (like git branch names and a list of servers) so that subsequent commands can use them. Commands are run in sequence in Capistrano. 
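To put the pieces together, day-to-day usage from the project root looks roughly like this (a sketch using the `r` alias and the `demo`/`production` tasks defined above):

```bash
r capify .                  # one-time: generate the Capfile and config/deploy.rb
r cap demo deploy           # deploy the develop branch to the demo server
r cap production deploy     # deploy master to the production server
r cap demo deploy:restart   # call the Forever-backed restart task directly
```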
I also created one more command called "tail" which is a quick way for me to tail the output of my application which Forever is smart enough to capture and store in a log file for me. To monitor my demo server, it would be "r cap demo tail". This will run until I close it. For my node.js deployments, I have node.js listen on port 8080 and use haproxy to route that to port 80. Other folks like to use nginx and sockets to route their app to the normal web ports. Either way, it is best to NOT run your node app as root (which is required if you want to listen on port 80). If you do want to do this, you will need to use Capistrano’s try\_sudo command to run the start/restart scripts as root. The power of Capistrano becomes even more obvious when you have many servers. Just keep adding ‘server’ entries with distinct roles. You will note that tasks you create will only be executed on servers with specific roles, so you can carefully manage your deployment. There are many great Capistrano resources out there (including many awesome extensions for controlling EC2 servers, logging, etc) and now you can use them all with your node.js apps! *Originally published at 04 Apr 2012* --- --- url: /blog/post/2016-10-29-deploying-from-flynn-to-travisci.md description: >- In my previous post, I shared some tips on creating a production Flynn cluster. If you don’t know, Flynn is an open-source self-hosted… --- In my previous post, I shared some tips on creating a production Flynn cluster. If you don’t know, [Flynn](https://flynn.io) is an open-source self-hosted Heroku replacement (PaaS). In this article, I’ll share some tips on how to set up continuous integration using [Travis.CI](https://travis-ci.org/) and your Flynn server. This project assumes you already have your project’s test suite running automatically on Travis.CI. ![](/images/medium-export/1__yZ45b6I1QGRBwWtQqNqBiA.png) To understand how this process works, you will need to understand Flynn’s git security model. When you install the Flynn client on your development machine and connect to your cluster, Flynn stores a cluster key in a file (`~/.flynnrc`) which is used to authenticate to the server. All Flynn users on your cluster share this key. Flynn also has a helper which stores this key in your OSX Keychain (or equivalent system-level keystore) so that when you git push, the credentials will be used. ![](/images/medium-export/1__xTWBVG3tTd7VgXSFakEM7g.png) However, you don’t *need* to use this system-level keystore if you **configure your git remote to include the username and password** directly. There is a secure way to do this on Travis! You also **do not need** the Flynn CLI tool. First, you’ll need the Travis ruby gem installed: ```bash gem install travis ``` Then, you’ll be using the gem to encrypt your Flynn cluster key (which you can find in `~/.flynnrc`). Travis lets you store encrypted data within your project’s `.travis.yml` configuration file. This allows you to share your project with no worry that anyone other than the Travis servers can deploy to your cluster. ```bash travis encrypt FLYNN_KEY=XXXXXXX ``` Now take that encrypted output and add it to your project’s `.travis.yml` under the `secure` directive. 
Next, we’ll use Travis’ `after_success` lifecycle event to tell Travis to push our now-tested branch to the server: ```yml after_success: - git pull - git remote add flynn https://user:$FLYNN_KEY@git.site.com/myapp.git - git push flynn master ``` A complete file for a [Node.js](https://medium.com/u/96cd9a1fb56) project would look like: ```yml sudo: false language: node_js node_js: - "6" env: - secure: "YYYYYYYYYYYYYYY=" after_success: - git pull - git remote add flynn https://user:$FLYNN_KEY@git.server.com/myapp.git || true - git push flynn master script: npm run test ``` And that’s it! Travis will now deploy to Flynn in a secure way after every green build. ### Advanced Topics: **Deploying only after all steps in a build matrix have completed** With Travis, it is possible to run your test suite a few times with separate configurations. Perhaps you want to run a test once with MySQL and once with Postgres… you can! Travis calls this feature the [Build Matrix](https://docs.travis-ci.com/user/customizing-the-build). You can configure separate collections of environment variables to control your test behavior. However, there is no built-in way for Travis to expose what would amount to an `after-build-matrix-success` directive in the configuration `.travis.yml`. Luckily, someone has solved this problem for us: [`travis-after-all`](https://github.com/alrra/travis-after-all). [Travis-after-all](https://github.com/alrra/travis-after-all) is a cool little [Node.js](https://medium.com/u/96cd9a1fb56) package which polls Travis’ internal APIs to tell when all parts of your build matrix have completed, and then runs your deploy script once-and-only-once. Modify your deployment scripts in the ways described by the project’s Readme, and you should be good to go! **Custom Branches = Custom Deployments** You may only want to deploy to your Flynn cluster when certain branches are tested. Perhaps `master` should be deployed to `staging.site.com` and `production` should be deployed to `www.site.com`. When testing on Travis, you have a few [environment variables](https://docs.travis-ci.com/user/environment-variables/#Default-Environment-Variables) exposed to you, such as `TRAVIS_BRANCH`, which you can use to make decisions about how to deploy. The example I described above would look like: ```yml sudo: false language: node_js node_js: - "6" env: - secure: "xxxxxxxxxx=" after_success: - | FLYNN_APP="" if [ "$TRAVIS_BRANCH" = "master" ]; then FLYNN_APP="www-staging"; fi if [ "$TRAVIS_BRANCH" = "production" ]; then FLYNN_APP="www"; fi if [ "$FLYNN_APP" = "" ]; then echo "skipping branch $TRAVIS_BRANCH" else git pull git remote add flynn https://user:$FLYNN_KEY@git.site.com/$FLYNN_APP.git git push flynn master fi script: npm run test ``` (why yes, you *can* write bash directly in your `.travis.yml` file!) --- --- url: /blog/post/2025-07-23-designing-sql-tools-for-ai-agents.md description: >- Build production SQL tools for LLM agents with proper auth, least-privilege access, and injection protection. PostgreSQL examples included. --- ![Additional workflow example showing SQL tool implementation](/images/posts/2025-07-23-designing-sql-tools-for-ai-agents/additional-workflow.png) One of the most popular use-cases for AI/LLM Agents is exploring and activating data in your SQL databases and warehouses. How can you build safe and reliable agents? Let's explore the space. 
## Start with Boundaries, Not Prompts Prompting tells the model what you want, not what it's allowed to do. Think of a prompt like an *intent* - helpful for UX, but never a security guarantee. Real enforcement lives a layer deeper: in the database engine itself or a narrowly scoped service tier you fully control. To get this right, you will need: 1. **Purpose‑built roles** - create one DB role per toolkit so your permission story is self‑documenting. 2. **Limit surface area** * **Access** - If your agent does not need write access, do not grant it. * **Tables** - expose only the ones the agent genuinely needs. * **Columns** - omit ssn, password\_hash, and other PII entirely. * **Rows** - enable Row‑Level Security (RLS) so agents see only their slice of data. 3. **Pin connections to those roles** - use a connection pool that the LLM cannot access, via a remote tool server. Never grant the agent the ability to modify this with commands like SET ROLE. Furthermore, to speed up your agent, keep a connection pool alive that was created when the agent booted up, and don't let the agent change it. **PostgreSQL example:** ```sql ALTER TABLE orders ENABLE ROW LEVEL SECURITY; CREATE POLICY region_policy ON orders USING (region = current_setting('app.current_region')); GRANT SELECT (id, customer_id, total_cents, region) ON orders TO ai_reporting; ``` Other engines have similar primitives: Snowflake's **Secure Views**, BigQuery's **Authorized Views**, and SQL Server's **Dynamic Data Masking** all let you enforce the same principle—*least privilege by default, enforced where the data lives*. Creating AI tools (e.g. MCP servers) with overly permissive access is why recent security issues have occurred ([e.g.](https://news.ycombinator.com/item?id=44502318)). Don't do that! When building out your tool, consider how the permissions of the database account and the roles of your end-users relate. There are a few options: 1. **Single User:** The agent is single-user, and the agent's database access is exactly the same as the user's access. In this use case, storing the connection string for the database as an environment variable would work. 2. **Single Role:** The agent is multi-user, and the agent's database access is exactly the same for all users of the agent. E.g. the "finance agent" will only be used by members of the finance department, or the "analytics agent" has access to tables that the whole company can see. Once again, storing the connection string for the database as a global environment is appropriate. 3. **Heterogeneous Roles**: Multiple types of users will be using agents, each with different access. In this case, your agent will need to manage multiple connections for each type of user. Perhaps each user is required to store their own connection string as a secret, or you'll be looking up each user's permissions in a system like [DreamFactory](https://www.dreamfactory.com/) or a similar entitlement server. ## Operational or Exploratory Tools? It is important to differentiate between these two primary types of SQL tools for AI agents, as their design and security considerations vary. The two types of SQL tools, broadly speaking, can be classified as "Operational" and "Exploratory". Operational tools have very clearly defined use-cases, and can even safely modify data. Exploratory tools are for data exploration and reporting, and need to support unknown use-cases and expansive schemas. 
## Operational Tools: Precision and Control Operational SQL tools are designed for specific, often transactional, interactions with the database. They are typically used for tasks that involve data modification (inserts, updates, and even deletes) or highly structured data retrieval. The emphasis here is on precision, predictability, and safety. Agents are *more* susceptible to [SQL injection attacks](https://owasp.org/www-community/attacks/SQL_Injection) than traditional software due to an added layer of interpretation if you let the LLM write any part of the query. You need to guard against attacks just like any other software application interacting with a database. You shouldn't be letting your LLM write SQL statements whenever possible. What you should be doing instead is building Operational Tools with the following properties: * **Prepared Statements:** Always utilize prepared SQL statements to prevent SQL injection vulnerabilities. The AI agent should provide parameters that are then bound to the pre-compiled statement, rather than constructing raw SQL queries. * **Specific Methods and Input Validation:** Each operational tool should expose clearly defined methods (functions) that the AI agent can call. These methods must include robust input validation, ensuring that all incoming data conforms to expected types, formats, and ranges. * **Enumerate Allowed Values:** For fields with a limited set of valid inputs, use enums or lookup tables to restrict the AI agent's choices. This prevents the generation of invalid or malicious data. * **Least Privilege:** As previously discussed, these tools must operate with the absolute minimum database permissions required. ### Examples of Operational Tools Operational tools have type descriptions like: ```python # An example of a typed operational read tool @tool(requires_secrets=["DATABASE_CONNECTION_STRING"]) async def count_new_users_of_app( context: ToolContext, aggregation_window: Annotated[AggregationWindowEnum, "The time range to group new users by. Default: 'day'"] = AggregationWindow.day, # Arcade will expand the options of AggregationWindowEnum into the prompt automatically exclude_internal_users: Annotated[bool, "Should we ignore internal users? Default: True"] = True, limit: Annotated[int, "The number of rows to return. Default: 100"] = 100, ) -> int: """ Count the number of new users within a time window """ ``` Which produces a query like: ```sql SELECT date_trunc($1, created_at) AS period, COUNT(*) AS user_count FROM users WHERE internal = false GROUP BY period ORDER BY period DESC LIMIT $2 ``` And ```python # An example of a safe tool which would modify the database @tool(requires_secrets=["DATABASE_CONNECTION_STRING"]) async def update_user_payment_plan( context: ToolContext, user_id: Annotated[str, "The user ID to modify"], payment_plan: Annotated[PaymentPlanEnum, "The payment plan for the user"], # Arcade will expand the options of PaymentPlanEnum into the prompt automatically ) -> PaymentPlanEnum: """ Update the payment plan for a specific user. """ ``` Which produces a query like: ```sql UPDATE users SET payment_plan = $1 WHERE id = $2 LIMIT 1 ``` Note the ToolContext argument in the examples above. That's Arcade's way of passing secrets and user information into the tool without having it go through the (unsafe) LLM. You never want something as sensitive as a database connection string or password to be available to the LLM - it might display it, leak it, or worse, train on it. 
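To make the prepared-statement point concrete, here is a minimal sketch of what the body of a tool like `count_new_users_of_app` might execute, assuming asyncpg and that the connection string has already been resolved from the tool's secrets outside of the LLM's view:

```python
import asyncpg

# The SQL text is fixed at development time; the agent only ever supplies parameter values.
QUERY = """
    SELECT date_trunc($1, created_at) AS period, COUNT(*) AS user_count
    FROM users
    WHERE internal = false
    GROUP BY period
    ORDER BY period DESC
    LIMIT $2
"""

async def count_new_users(dsn: str, window: str = "day", limit: int = 100):
    # dsn comes from the tool's secret store (e.g. ToolContext), never from the model.
    conn = await asyncpg.connect(dsn)
    try:
        # Values are bound server-side to $1/$2, so no SQL is ever concatenated from LLM output.
        return await conn.fetch(QUERY, window, limit)
    finally:
        await conn.close()
```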
Learn more about ToolContext [here](https://docs.arcade.dev/home/build-tools/tool-context). ## Exploratory Tools: Exploration and Insights Exploratory SQL tools, in contrast, are designed for querying and extracting insights from data. The most common use case is to enable internal users within your organization to access your data warehouse. A data warehouse is a massive database (commonly [Snowflake](https://www.snowflake.com/en/) or [Databricks](https://www.databricks.com/)) that contains all of the data from all of the tools your company uses. It is kept in sync via an ELT/ETL tool like [Airbyte](https://airbyte.com/). The primary constraint for these tools is that they *must be read-only*. This fundamental restriction significantly reduces the security surface area. * **Read-Only Enforcement:** The database roles associated with exploratory tools must explicitly have `SELECT` privileges only, with no `INSERT`, `UPDATE`, or `DELETE` permissions. Well-written SQL tools will also enforce that only SELECT queries are allowed to be executed. Again, *Agents are even more susceptible to SQL Injection attacks!* Prevent them at the connection level. * **Schema Understanding:** Your agents will need to know what tables exist and what they are for. The best pattern would be to load descriptions of the tables you need into the context ahead of time, along with any reference metadata that is available. Perhaps you have a Semantic Layer (e.g. [dbt](https://docs.getdbt.com/docs/use-dbt-semantic-layer/dbt-sl) or [Cube](https://cube.dev/)) you can use, or table annotations which can be loaded from your database. Failing that, describe your schema in plain text. * **Table Descriptions:** To enable effective querying by the AI agent, provide clear and concise descriptions of the tables and their columns. It is unlikely that all of your tables will be needed for all use-cases (and that will waste tokens), so the LLM/Agent should be encouraged to learn the schema of the relevant tables before trying to query them. A "Look → Plan → Query" workflow works great. * **Querying Best Practices:** While exploratory tools can be more general, encourage practices such as: * **Limiting Result Sets:** Require a `LIMIT` clause in its queries to prevent excessively large result sets. * **Specific Column Selection:** Encourage selecting only necessary columns rather than `SELECT *`. * **`EXPLAIN ANALYZE` (for development/debugging):** While not for the LLM to run directly in production, explaining the query plan can help in tool development. * **RetryableToolErrors for Workflow Learning:** Implement custom error types like RetryableToolError when the LLM attempts an invalid exploratory query (e.g. hallucinating columns that don't exist). This signals to the LLM that it needs to inspect the available tables and their descriptions *before* attempting the next query, teaching it a more robust workflow. Learn more about retryable tool errors [here](https://docs.arcade.dev/home/build-tools/retry-tools-with-improved-prompt). ### Examples of Exploratory Tools A good set of exploratory SQL tools for data exploration might look like: **Discover Schema:** ```python @tool(requires_secrets=["DATABASE_CONNECTION_STRING"]) async def discover_schemas( context: ToolContext, ) -> list[str]: """ Discover all the schemas in the postgres database. 
""" @tool(requires_secrets=["DATABASE_CONNECTION_STRING"]) async def discover_tables( context: ToolContext, schema_name: Annotated[ str, "The database schema to discover tables in (default value: 'public')" ] = "public", ) -> list[str]: """ Discover all the tables in the postgres database when the list of tables is not known. ALWAYS use this tool before any other tool that requires a table name. """ ``` **Get Table Schema:** ```python @tool(requires_secrets=["DATABASE_CONNECTION_STRING"]) async def get_table_schema( context: ToolContext, schema_name: Annotated[str, "The database schema to get the table schema of"], table_name: Annotated[str, "The table to get the schema of"], ) -> list[str]: """ Get the schema/structure of a postgres table in the postgres database when the schema is not known, and the name of the table is provided. This tool should ALWAYS be used before executing any query. All tables in the query must be discovered first using the tool. """ ``` **Execute Select Query:** ```python @tool(requires_secrets=["DATABASE_CONNECTION_STRING"]) async def execute_select_query( context: ToolContext, select_clause: Annotated[ str, "This is the part of the SQL query that comes after the SELECT keyword with a comma separated list of columns you wish to return. Do not include the SELECT keyword.", ], from_clause: Annotated[ str, "This is the part of the SQL query that comes after the FROM keyword. Do not include the FROM keyword.", ], limit: Annotated[ int, "The maximum number of rows to return. This is the LIMIT clause of the query. Default: 100.", ] = 100, offset: Annotated[ int, "The number of rows to skip. This is the OFFSET clause of the query. Default: 0." ] = 0, join_clause: Annotated[ str | None, "This is the part of the SQL query that comes after the JOIN keyword. Do not include the JOIN keyword. If no join is needed, leave this blank.", ] = None, where_clause: Annotated[ str | None, "This is the part of the SQL query that comes after the WHERE keyword. Do not include the WHERE keyword. If no where clause is needed, leave this blank.", ] = None, having_clause: Annotated[ str | None, "This is the part of the SQL query that comes after the HAVING keyword. Do not include the HAVING keyword. If no having clause is needed, leave this blank.", ] = None, group_by_clause: Annotated[ str | None, "This is the part of the SQL query that comes after the GROUP BY keyword. Do not include the GROUP BY keyword. If no group by clause is needed, leave this blank.", ] = None, order_by_clause: Annotated[ str | None, "This is the part of the SQL query that comes after the ORDER BY keyword. Do not include the ORDER BY keyword. If no order by clause is needed, leave this blank.", ] = None, with_clause: Annotated[ str | None, "This is the part of the SQL query that comes after the WITH keyword when basing the query on a virtual table. If no WITH clause is needed, leave this blank.", ] = None, ) -> list[str]: """ You have a connection to a postgres database. Execute a SELECT query and return the results against the postgres database. No other queries (INSERT, UPDATE, DELETE, etc.) are allowed. ONLY use this tool if you have already loaded the schema of the tables you need to query. Use the tool to load the schema if not already known. 
    The final query will be constructed as follows:

    WITH {with_clause}
    SELECT {select_clause}
    FROM {from_clause}
    JOIN {join_clause}
    WHERE {where_clause}
    GROUP BY {group_by_clause}
    HAVING {having_clause}
    ORDER BY {order_by_clause}
    LIMIT {limit} OFFSET {offset}

    When running queries, follow these rules which will help avoid errors:
    * Never "select *" from a table. Always select the columns you need.
    * Always order your results by the most important columns first. If you aren't sure, order by the primary key.
    * Always use case-insensitive queries to match strings in the query.
    * Always trim strings in the query.
    * Prefer LIKE queries over direct string matches or regex queries.
    * Only join on columns that are indexed or the primary key. Do not join on arbitrary columns.
    """
```

You can see an example exploratory Postgres Arcade toolkit [here](https://github.com/ArcadeAI/arcade-ai/blob/main/toolkits/postgres/arcade_postgres/tools/postgres.py).

Remember above how we discussed *schema understanding*? Note how these general-purpose tools lack that, which means they won't be as effective as they could be. Imagine that each of these tools could be given additional context about the structure of your database:

* Which are the final/gold tables in your [Medallion Architecture](https://www.databricks.com/glossary/medallion-architecture), and therefore the tables the LLM should prefer for most queries?
* Which tables are the most useful for analysis in general (e.g. users or accounts)?
* Which types of questions prefer which tables (e.g. financial questions should start with the `normalized_accounts` tables).
* Any "translations" the LLM might need to find your data (e.g. all payment and transaction information is in USD, listed in cents).

Hinting official preferences to the LLM will save a lot of time and tokens!

### On Dynamic Schema Loading

As your schema grows, you will encounter performance and context limitations: you won't be able to pre-load your whole schema into the LLM because it will be too large. When that happens, you'll need to start looking into dynamically loading your schema as needed via "discovery" tools, or memory compression techniques.

Consider the example at the start of this article. A multi-turn agent used the hints in our tools to properly build out the tool-calling workflow for itself:

![Multi-turn agent workflow example showing tool-calling workflow](/images/posts/2025-07-23-designing-sql-tools-for-ai-agents/workflow-example.png)

The tools as defined above were how we prompted the LLM to inspect the database and find only the tables it needed, and then load their schemas, saving time and tokens.

## Customization vs. Generality

While general SQL querying tools can be useful, remember that tools specifically designed for a use case will be more reliable and less prone to errors than allowing the LLM to construct arbitrary SQL queries. *Yes, the end goal of all Exploratory tools is to convert them into Operational tools once you have your query dialed in.* We are moving tools up from the "service" tier to the "workflow" tier, which has better quality and lower latency.

We can classify the steps on this journey and some of the changing design criteria:

1. **Exploratory (service tier):**
   1. Low-accuracy tools which require human hand-holding
   2. Highly general tools which require elevated permissions
   3. Low token count
   4. Heavy LLM reliance
   5. Likely a low number of tools for the LLM to choose from
   6. Example: `execute_query()`
2. **Hybrid:**
   1. More accurate
   2. Still general, but limited to a specific domain
   3. High token count
   4. Example: `GetRecentSalesWins()`
3. **Operational (workflow tier):**
   1. Very accurate, appropriate for operationalization
   2. Highly specific and can work with tightly scoped permissions
   3. Very high token count
   4. Low LLM reliance
   5. Likely a high number of tools for the LLM to choose from
   6. Example: `GetRecentSalesWinsBySalesperson()`

## What's Next?

In closing, it *is* possible to build effective and safe SQL tools for Agents. But you need to be clear about what tools the agent can call, and create specific boundaries to keep them safe.

![Guard dog representing security for AI agents](/images/posts/2025-07-23-designing-sql-tools-for-ai-agents/guard-dog.jpg)

---

---
url: /blog/post/2020-11-13-developing-grouparoo-on-macos-big-sur.md
description: Learn how to run a Typescript app on macOS Big Sur. Find out more.
---

![macOS Big Sur Screenshot](/images/posts/2020-11-13-developing-grouparoo-on-macos-big-sur/big-sur.jpeg)

The [newest release of macOS](https://www.apple.com/newsroom/2020/11/macos-big-sur-is-here/) is out! Like any new OS release, there are plenty of new features... and new bugs to squash. The Grouparoo team develops on macOS, and we've taken notes about what we needed to do to continue being productive through the upgrade.

## Update Homebrew and Databases

Like most macOS developers, we install our dependencies and databases with [`Homebrew`](https://brew.sh), a great package manager for macOS. The first thing I checked after the upgrade was if my [Homebrew services](https://github.com/Homebrew/homebrew-services) were running. Well... they were not.

![macOS Big Sur Screenshot](/images/posts/2020-11-13-developing-grouparoo-on-macos-big-sur/homebrew.png)

The good news is that newer versions of Homebrew work with macOS Big Sur - but you need to `brew upgrade`.

```bash
brew upgrade
```

Pay attention - you'll likely be asked for your password. This command will update Homebrew itself and **all** of your installed packages to the latest version. This step is important, because many packages will need to be re-compiled with the newer version of Xcode you now have.

Upgrading all of your packages is a big step. While not related to Big Sur, when I ran `brew upgrade`, I bumped my Postgres version from 11 to 13. When you upgrade your Postgres version, you need to migrate your databases so they work with the new version. You can accomplish this via:

```bash
brew services stop postgres
brew postgresql-upgrade-database
brew services restart postgres
```

Finally, the `brew upgrade` command will have fixed any file permissions that changed during the OS upgrade. Restart any other Homebrew services you have running. In my case, I needed to restart Redis, as it couldn't write to the file system until after the upgrade.

```bash
brew services restart redis
```

## Rebuild Node.js packages

Node.js continued to function just fine after the macOS upgrade, but since Xcode and various underlying libraries have been changed, I needed to re-compile any `node_modules` which had a compilation step. The easiest way to do this is just to re-install everything:

```bash
rm -r node_modules
npm install
```

At Grouparoo, we use [`pnpm`](https://pnpm.js.org) to manage dependencies in our monorepo. In our case, there's a single command to rebuild our dependencies:

```bash
pnpm install --force
```

And that's all it took to get back to work on macOS Big Sur!
--- --- url: /blog/post/2017-04-03-dont-be-a-jerk-oss.md description: >- A few years back, I was disappointed that so many open source licenses were opaque and hard to understand. So in 2011 I created a satirical license. --- A few years back, I was disappointed that so many open source licenses were opaque and hard to understand. So in 2011 I created a satirical license called `don’t-be-a-jerk` which roughly covered everything I thought an open source license would need. And then 6 years passed. Today, I ran across this license in the wild! This license has been translated into 4 languages, and apparently, is good enough for some projects! Neat. ![](/images/medium-export/1__K59NnX6vCMF8__MRPKcvScg.png) Here is the English text of the **Don’t be a Jerk** Open-Source software license, as of April 3, 2017. ```text Don't Be a Jerk: The Open Source Software License. Last Update: March 19, 2015 This software is free and open source. - *I* am the software author. *I* might be a *we*, but that's OK. - *You* are the user of this software. *You* might also be a *we*, and that's also OK! > This is free software. I will never charge you to use, license, or obtain this software. Doing so would make me a jerk. > I will never take down or start charging for what is available today. Doing so would make me a jerk. > You may use this code (and by "code" I mean *anything* contained within in this project) for whatever you want. Personal use, Educational use, Corporate use, Military use, and all other uses are OK! Limiting how you can use something free would make me a jerk. > I offer no warranty on anything, ever. I've tried to ensure that there are no gaping security holes where using this software might automatically send your credit card information to aliens or erase your entire hard drive, but it might happen. I'm sorry. However, I warned you, so you can't sue me. Suing people over free software would make you a jerk. > If you find bugs, it would be nice if you let me know so I can fix them. You don't have to, but not doing so would make you a jerk. > Speaking of bugs, I am not obligated to fix anything nor am I obligated to add a feature for you. Feeling entitled about free software would make you a jerk. > If you add a new feature or fix a bug, it would be nice if you contributed it back to the project. You don't have to, but not doing so would make you a jerk. The repository/site you obtained this software from should contain a way for you to contact me. Contributing to open source makes you awesome! > If you use this software, you don't have to give me any credit, but it would be nice. Don't be a jerk. Enjoy your free software! ``` The latest version can be found here: [**evantahler/Dont-be-a-Jerk**](https://github.com/evantahler/Dont-be-a-Jerk) --- --- url: /blog/post/2021-04-13-google-cloud-run-no-background-job.md description: 'Google Cloud Run is a great platform as a service, but not for background jobs' --- Grouparoo is a self-hosted product, so we are always looking for the simplest ways to help our customers run the application. A new member of the Google Cloud Platform (GCP) family is [Google Cloud Run](https://cloud.google.com/run) - which is the closest Google has come yet to a Heroku-like "Git-Ops" way to deploy your applications. It handles load balancing, scaling, and more for you and is a really compelling product. Combined with [Google Cloud Build](https://cloud.google.com/build), you can wire up your service to (re)deploy automatically when your git repository changes. 
Grouparoo was easy to run on Google Cloud Run, with a few caveats:

1. You'll need a VPC connector to bridge the Cloud Run networks and any other services you might be running (like a postgres database or redis service). [Learn more](https://cloud.google.com/vpc/docs/configure-serverless-vpc-access#creating_a_connector).
2. When configuring hosted Redis for Grouparoo (via Google Memorystore), be sure to enable an authentication string, but not encryption in transit.
3. You'll be using "Google Cloud Build" to manage deployments. The Cloud Build service will also need access to the `Serverless VPC Access User` and `Compute Network Admin` IAM roles.

After those steps, and setting our environment variables, we had Grouparoo running on Google Cloud Run!

But... it didn't last long. Every few hours, we would notice that our job throughput would grind to 0. We would then visit the site to look for failures, but everything appeared to be OK, and jobs would start working again. However, only a few hours later, things would slow down again. What was going on?

After some digging, we learned that Google Cloud Run throttles CPU based on HTTP requests, and *only* HTTP requests.

> When an application running on Cloud Run finishes handling a request, the container instance's access to CPU will be disabled or severely limited. Therefore, you should not start background threads or routines that run outside the scope of the request handlers.

This makes Google Cloud Run a poor platform choice for an application like Grouparoo which manages its own background jobs, and expects at least one instance to always be available for scheduling. You can learn more [here](https://cloud.google.com/run/docs/tips/general#avoiding_background_activities). That explains why things started working again whenever we visited the site, and then slowed down again shortly after.

***

If you are looking to run Grouparoo on GCP, check out our [Google Cloud example project](https://github.com/grouparoo/app-example-gcp) which uses node.js natively, and connects to a hosted Redis and Postgres database.

---

---
url: /blog/post/2021-03-17-dont-use-underscores-in-http-headers.md
description: >-
  Don't use underscores in your HTTP headers... at least according to AWS and Nginx!
---

Don't use underscores in your HTTP Headers... at least according to AWS and Nginx!

```bash
# TLDR;
curl --header "AUTH_TOKEN: abc" example.com # is bad
curl --header "AUTH-TOKEN: abc" example.com # is OK
```

Grouparoo is a self-hosted application, and we are always helping folks run and deploy our service in new ways. Recently, we've been working on an [example application](https://github.com/grouparoo/app-example-aws) for Amazon Web Services' (AWS) Elastic Beanstalk service. Elastic Beanstalk is AWS's original Platform as a Service (PaaS), which means you can deploy your application without having to directly manage the servers yourself. Also, Elastic Beanstalk servers are within your Virtual Private Cloud (VPC), so they are a good choice if you want to integrate with any of AWS' other services, like a database or cache.

The `app-example-aws` app deployed just fine, but we were running into a strange bug: users of our web UI would be logged out on every subsequent page load! We weren't seeing this behavior on any of the other hosting platforms we've used, including other AWS hosting options. What could be wrong? Eventually we narrowed down the problem to communication between our website UI and the API server.
The Grouparoo UI server uses a special header, `X-GROUPAROO-SERVER_TOKEN`, along with the user's session cookie, to authenticate against the API to pre-hydrate our pages on behalf of the user making the request. This page hydration request was the only type of request failing. Eventually we got into the weeds of the network request, and saw that the API was never receiving the `X-GROUPAROO-SERVER_TOKEN` header, but everything else was coming through OK:

```json
{
  "headers": {
    "connection": "upgrade",
    "host": "app-example-aws.example.com",
    "x-real-ip": "172.31.xxx.xxx",
    "x-forwarded-for": "54.157.xxx.xxx, 172.31.xxx.xxx",
    "x-forwarded-proto": "https",
    "x-forwarded-port": "443",
    "x-amzn-trace-id": "Root=1-60517ca4-xxxxx",
    "accept": "application/json",
    "content-type": "application/json",
    "cookie": "grouparooSessionId=xxxxxxxxxx",
    "user-agent": "axios/0.21.1"
  }
}
```

After some digging, we learned that Elastic Beanstalk fronts its applications with Nginx acting as a reverse proxy, which, by default, considers headers with underscores to be CGI variables of yore and ignores them. By default the Nginx option `underscores_in_headers` is `off`, and you can learn more [here](https://www.nginx.com/resources/wiki/start/topics/tutorials/config_pitfalls/?highlight=underscore#missing-disappearing-http-headers). Please note that using underscores in headers is perfectly valid per the HTTP spec, but Nginx, by default, will ignore them.

![A header in football. From https://unsplash.com/photos/JqCpvGN0JFo](/images/posts/2021-03-17-dont-use-underscores-in-http-headers/header.jpeg)

It was a matter of preference whether the right thing to do was to change our header to not use underscores (`X-GROUPAROO-SERVER-TOKEN`), or to modify the Nginx reverse proxy configuration on our Elastic Beanstalk servers (which is possible - [see here](https://docs.aws.amazon.com/elasticbeanstalk/latest/dg/java-se-nginx.html)). At the end of the day we want Grouparoo to work out-of-the-box on as many platforms as possible without needing custom configuration. Nginx is a wildly popular web server, load balancer, and reverse proxy - and we should be compatible with its defaults. To that end, we opted to change our headers not to use underscores. I recommend that everyone else developing a web application do the same and follow Nginx's defaults to avoid problems like this down the road.

---

---
url: /blog/post/2014-01-06-elasticdump.md
description: >-
  ElasticDump - The Node.js CLI tool for importing and exporting Elasticsearch data
---

![](/images/medium-export/1__OopZJXTgMJhT1S0pBaGapQ.jpeg)

### Intro

At [TaskRabbit](http://www.taskrabbit.com), we use [ElasticSearch](http://www.elasticsearch.org) for a number of things (which include search of course). In our development, we follow the normal pattern of having a few distinct environments which we use to build and test our code. The 'acceptance' environment is supposed to be a mirror of production, including having a copy of its data. However, we could not find a good tool to help us copy our ElasticSearch indices… [so we made one](https://github.com/taskrabbit/elasticsearch-dump)!

[**taskrabbit/elasticsearch-dump**](https://github.com/taskrabbit/elasticsearch-dump)

### Use

elasticdump works by sending an input to an output. Both can be either an elasticsearch URL or a File.
* Elasticsearch:
  * format: `{protocol}://{host}:{port}/{index}`
  * example: `http://127.0.0.1:9200/my_index`
* File:
  * format: `{FilePath}`
  * example: `/Users/evantahler/Desktop/dump.json`

You can then do things like:

Copy an index from production to staging:

```bash
elasticdump --input=http://production.es.com:9200/my_index --output=http://staging.es.com:9200/my_index
```

Back up an index to a file:

```bash
elasticdump --input=http://production.es.com:9200/my_index --output=/var/dat/es.json
```

### Options

* `--input` (required) (see above)
* `--output` (required) (see above)
* `--limit` how many objects to move in bulk per operation (default: 100)
* `--debug` display the elasticsearch commands being used (default: false)
* `--delete` delete documents one-by-one from the input as they are moved (default: false)

### Notes

* elasticdump (and elasticsearch in general) will create indices if they don't exist upon import
* we are using the put method to write objects. This means new objects will be created and old objects with the same ID will be updated
* the file transport will overwrite any existing files
* If you need basic http auth, you can use it like this: `--input=http://name:password@production.es.com:9200/my_index`

You can download elasticdump from [NPM](https://npmjs.org/package/elasticdump) or [GitHub](https://github.com/taskrabbit/elasticsearch-dump)

*Originally published at 06 Jan 2014*

---

---
url: /blog/post/2015-01-07-elasticseach-production-notes-part-2.md
description: Fast and Stable Elasticsearch in production... again!
---

This post is a continuation of my previous post, [Elasticsearch in Production](http://tech.taskrabbit.com/blog/2014/07/18/elasticsearch-in-production/). It has been a few months since we became *heavy* elasticsearch production users, and this is the first time that I feel our cluster is stable. It has been a long road for us, which included creating tools like [ElasticDump](http://tech.taskrabbit.com/blog/2014/01/06/elasticsearch-dump/) and [Waistband](http://tech.taskrabbit.com/blog/2014/03/14/waistband), a number of version changes, and custom integrations, but we are finally there! With that in mind, here is the up-to-date list of what we did/learned to stabilize the TaskRabbit Elasticsearch cluster.

### Treat it Like a Database (again)

This mental shift is still the most important thing. You need backups (we use the [AWS elasticsearch plugin](https://github.com/elasticsearch/elasticsearch-cloud-aws#s3-repository) to make nightly snapshots to S3), you need import/export tools (which is why we made [ElasticDump](http://tech.taskrabbit.com/blog/2014/01/06/elasticsearch-dump/)), and you need good monitoring. 2 new plugins we have recently installed are [elasticsearch-HQ](https://github.com/royrusso/elasticsearch-HQ) and [Whatson](https://github.com/xyu/elasticsearch-whatson).

[Whatson](https://github.com/xyu/elasticsearch-whatson) helps you visualize the commit state of your Lucene indexes. We learned via this tool that some of our indexes are constantly being written to, and thus we usually have a long delay before data is committed to disk.

![](/images/medium-export/0__SnPDTeQgntuVKyK9.png)

[elasticsearch-HQ](https://github.com/royrusso/elasticsearch-HQ) offers a ton of great features, but one of the most useful for us was the Heap and JVM visualizers. One of the ways we identified a bad query (discussed more below) was by noticing how fast our heap utilization was growing. We knew we had solved the bug when the heap growth velocity went down 3x.
![](/images/medium-export/0__dE5HGrEBP5__LmlDJ.png)

### Data Durability

Elasticsearch is the source-of-truth for some of our data. We had experienced [failure scenarios](https://groups.google.com/forum/#!topic/elasticsearch/M17mgdZnikk) where node failures caused significant data loss. This is not good! Here is an example:

* Cluster has 3 data nodes, A, B, and C. The index has 10 shards. The index has a replica count of 1, so A is the master and B is a replica. C is doing nothing. Re-allocation of indexes/shards is enabled.
* A crashes. B takes over as master, and then starts transferring data to C as a new replica.
* B crashes. C is now master with a partial dataset.
* There is a write to the index.
* A and B finally reboot, and they are told that they are now stale (as C had a write while they were away). Both A and B delete their local data. A is chosen to be the new replica and re-syncs from C.
* … all the data A and B had which C never got is lost forever.

While Elasticsearch's data sharding is awesome, we have learned that for truly important data, we need to increase the replication count so that ALL nodes in the cluster contain the data. This way, you can recover from any number of lost nodes, and any node in the cluster can start a new cluster in a catastrophic failure. You can learn more about this from the excellent ["call me maybe" post series](https://aphyr.com/posts/317-call-me-maybe-elasticsearch) which tests the recovery and partition tolerances of a few common database tools.

### Keep your versions up to date!

This goes without saying, but the Elasticsearch team is constantly making improvements. There are always significant performance improvements each release. There is no reason not to update/upgrade, especially because you can do a 0-downtime rolling update with Elasticsearch. One of the new features in 1.4.x is the ability to upgrade your Lucene indexes in-line. You can learn more [here](http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-upgrade.html).

### Virtualization and CPU binding

Along with the rule that Elasticsearch should never use more than 49% of the system's available RAM (and you should enable mlockall if your OS supports it), you should also never over-commit your system's CPUs. One of the mistakes I made while debugging our cluster was attempting to allocate more CPU resources to the elasticsearch Virtual Machines, as we had learned that the number of CPUs is related to the number of open indexes (to remain performant). While true, I foolishly created 32-core VMs on physical servers which only had 16 CPUs. While this worked OK most of the time, when elasticsearch was garbage collecting (a very CPU-intensive operation) the VM would grind to a halt because there was so much overhead needed by VMware to virtualize more CPUs than it actually had available. Oops.

### Geographic Resolution

One of the big improvements we made was adjusting the [resolution of geographic fields](http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-geohashgrid-aggregation.html). TaskRabbit uses elasticsearch for a lot of geography searching (i.e. which Taskers are available tomorrow within SOMA to help me with my Task?). When you import a geographic region into elasticsearch, it "rasterizes" the data so it can be searched more efficiently. However, the resolution of each of those "raster regions" is up to you.
We only need data at the "city block" level of resolution, so we were able to reduce the index size of our geographic indexes from the default level by 1/2. This required a lot less RAM to keep that data "hot" and also sped up queries!

### Wildcards

We learned that [Wildcard Queries](http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-wildcard-query.html) could be expensive, and should be avoided whenever possible. Think about `select * from table where col = x` in SQL vs `select * from table where col LIKE x`. LIKE requires scanning all the rows (and sometimes all of the columns)! A wildcard is very similar.

We had been using a wildcard query over a small collection of data… or so we thought. While the number of items in the index was only in the few thousands, the *width* of each element was huge. This meant that Elasticsearch needed to load the whole record into RAM to parse it. This was what caused the huge Heap growth (and decay) I mentioned earlier. Don't use wildcards in your queries.

![](/images/medium-export/0__a978q13jIcjwE3kN.png)

### The JVM and You

We still tune our JVM as we had talked about in our [last post](http://blog.evantahler.com/blog/2014/07/18/elasticsearch-in-production). We are still using the Oracle version of Java and the new G1 Garbage Collector via `JAVA_OPTS="$JAVA_OPTS -XX:+UseG1GC"`. There was a big improvement in stability with Oracle's JVM over OpenJDK, and we have seen no problems with the "unsupported", faster GC. See the [previous post](http://blog.evantahler.com/blog/2014/07/18/elasticsearch-in-production) for more information.

### What didn't work

**Bulk updates**. When you write a document to Elasticsearch, it is first added to the index, and then eventually indexed so it will appear in search, and this process happens asynchronously from the write. The [bulk APIs](http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/docs-bulk.html) allow you to pause the indexer while you write a large number of documents, and turn it back on when you are done. This is supposed to allow you to save CPU cycles and allow you to have a searchable snapshot of data that won't change while you are applying updates (*almost* like a commit in SQL terms). However, in our case, after we wrote our bulk data and re-enabled indexing, the CPU required to index the new data drastically slowed down all other operations in the cluster. We ended up keeping our "bulk updates" as many individual writes. While this has the property of slowing down each write, we end up keeping the whole cluster more stable.

### What's next?

**Nginx**. We are going to start proxying all Elasticsearch HTTP requests through nginx. This way, we can use the nginx logs to have a record of what queries and params we are using, and how long they take. This data in aggregate is a far better way to look for and analyze slow queries. We use [Sumologic](http://www.sumologic.com/) for this purpose. Down the road, this will also allow us to restrict certain routes and verbs to authenticated users (like DELETE / for instance).

**Waistband Permissions**. While Elasticsearch itself doesn't have users or permissions ([yet](http://www.elasticsearch.org/overview/shield/)), we can emulate some of this behavior in our client library. We can set rules so that a given application or connection won't be allowed to issue PUT or DELETE requests to the cluster. This will add a level of security to our development we didn't have before.

**Parent-child data separation**.
As mentioned above, we still have large bulk updates. However, the majority of this data doesn't change frequently. We can separate out what is essentially one wide table into a few smaller parent/child tables. This will make the data volume we write smaller, which in turn will speed up indexing.

### Closing Thoughts

* I want to thank [Doc](http://www.ministryofvelocity.com/) for spending time with us and helping us debug our cluster.
* The [Elasticsearch mailing list](https://groups.google.com/forum/#!forum/elasticsearch) is a great resource. The community is very friendly and welcoming. You should subscribe to the daily digest and browse the headlines… you will certainly see something relevant to your interests within a week.

---

---
url: /blog/post/2014-07-18-elasticsearch-in-production.md
description: Fast and Stable Elasticsearch
---

### Intro

Here at [TaskRabbit](http://www.taskrabbit.com), we have relied on [ElasticSearch](http://www.elasticsearch.org/) for over a year now. We started using it as log storage of all the events flowing through our [resque-bus](http://blog.evantahler.com/blog/2013/09/28/resque-bus/), as ElasticSearch is a great distributed storage tool. We create an index for each month (~20GB) and we can scan/search for anything we need.

Then, we started using ElasticSearch for, you know, search. We once again used resque-bus to populate all Taskers and Tasks into their respective indexes and it worked exactly as intended. Next, we started using ElasticSearch to populate recommendations to Taskers about open tasks they might want to do. We created a more complex search query matching their interests and past tasks with what was available in the marketplace.

![](/images/medium-export/0__ty4Zy3gdHSDzAyqO.png)

Then, we did the same for Clients!

![](/images/medium-export/0__neqvPC4dDzPTDT9w.png)

When we made the switch to the [New TaskRabbit](http://blog.taskrabbit.com/2014/07/10/the-new-taskrabbit-is-here-with-new-ios-android-apps-for-clients-and-1m-insurance-policy-on-every-task/) this summer, we started relying on ElasticSearch even more heavily. When you search for a Tasker to help you, our algorithm relies very heavily on ElasticSearch. There is a document for each Tasker which contains all the relevant info needed, which is hit with every query.

![](/images/medium-export/0__cyxqq__EMJC__vyVkz.png)

These changes reflect a consistent increase in the load we have applied to ElasticSearch over the past year… and eventually we pushed it to the limit :( For the first few days of the New TaskRabbit, we had at least one catastrophic ElasticSearch crash each day. ElasticSearch is now in the path of posting a task, so it is very important for us to keep it running. This is the chronicle of how we debugged ElasticSearch and got it working again!

### Treat it like a DB

First, we needed to make the mental shift about how we were thinking about ElasticSearch. In the old TaskRabbit, ElasticSearch could go down, and you could still hire people. This is no longer the case. More interestingly, in the old system ElasticSearch was not a primary store for any data, and we could always repopulate it with rake jobs. Now, we had 3 tiers of ElasticSearch data: Primary Storage, Search Cache, and Logs. The "Search Cache" data can still be rebuilt and the site still works if we lose our logs, but ElasticSearch had become the primary storage engine for communications between Taskers and Clients. This means that ElasticSearch is now as important to TaskRabbit as the mySQL cluster.
In order to handle this shift, we did a few things. First, we needed a tool to back up this index periodically. We really wanted a tool like mysqldump for ElasticSearch, but there wasn't one… so we made [ElasticDump](http://tech.taskrabbit.com/blog/2014/01/06/elasticsearch-dump/). We use this tool to snapshot the index, gzip the JSON, and store it offline. We can also use the same tool to load the data into our staging environment to debug problems with production data. Finally, we made sure that the rake jobs we use to repopulate the search indexes are as fast as possible. Rather than one rake job to load everything in (which would take days), we once again relied on resque-bus to fan out the loading process to as many worker nodes as possible. This allows us to load in all the data we need in only a few hours (in the case of catastrophic failure).

### GeoJSON and CPU

One of the most CPU intensive operations we have seen within ElasticSearch is the loading of a new GeoJSON shape. Every TaskRabbit Tasker draws a map of where they want to work, and the parsing of these shapes (and the metros of every city) is very CPU intensive. We learned that the default precision of 5m for every cell in a geohash was simply too detailed for our needs. [Changing this mapping](http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl-geohash-cell-filter.html) made our indexing speed orders of magnitude faster.

### The JVM and You

ElasticSearch runs on Java, and tuning the JVM's memory management is something that every operations engineer has to do in their career… and it is never easy. We were victims of a very long and painful garbage collection cycle, and it took us a while to learn how to resolve it. All we noticed at first was that for about 5 minutes, all data nodes in our ElasticSearch cluster would fail every few hours.

Step 1 was to get better logging. ElasticSearch has a slowQuery log you can enable in elasticsearch.yml:

```yaml
index.search.slowlog.threshold.query.warn: 10s
index.search.slowlog.threshold.query.info: 5s
index.search.slowlog.threshold.query.debug: 2s
index.search.slowlog.threshold.query.trace: 500ms
index.search.slowlog.threshold.fetch.warn: 1s
index.search.slowlog.threshold.fetch.info: 800ms
index.search.slowlog.threshold.fetch.debug: 500ms
index.search.slowlog.threshold.fetch.trace: 200ms
index.indexing.slowlog.threshold.index.warn: 10s
index.indexing.slowlog.threshold.index.info: 5s
index.indexing.slowlog.threshold.index.debug: 2s
index.indexing.slowlog.threshold.index.trace: 500ms
```

Enabling this showed us that in the downtime windows, we were still serving queries, but they were taking seconds to complete where normally they take milliseconds. Correlating that with the normal logs, we learned that these slower queries were blocking later queries from being parsed, and they were simply being rejected:

```text
...
Caused by: org.elasticsearch.common.util.concurrent.EsRejectedExecutionException: rejected execution (queue capacity 1000) on org.elasticsearch.transport.netty.MessageChannelHandler$RequestHandler
...
```

This indicates that we had a cascading failure! If one node slows down, requests would be routed to the next one, and compound the problem. Naively, we thought we simply had a throughput problem. We changed our ElasticSearch nodes over to larger hosts, greatly increasing the amount of RAM and CPU available. This, however, made the problem worse.
While it took longer for our failures to occur from a cold-boot, we went from 5 minutes of downtime to 15! Our [system monitoring](https://mmonit.com/) showed us that the host was never out of RAM, but during these outages we were pegging the CPU at 100%. What was going on?

![](/images/medium-export/0__snWdTeknvk2Ksn5W.png)

ElasticSearch will note in its normal logs if any GC is happening which affects a slow query, but we wanted more detail. Another logging option ElasticSearch has is `ES_USE_GC_LOGGING` which sets some JVM variables:

```bash
# from elasticsearch.in.sh.erb
if [ "x$ES_USE_GC_LOGGING" != "x" ]; then
  JAVA_OPTS="$JAVA_OPTS -XX:+PrintGCDetails"
  JAVA_OPTS="$JAVA_OPTS -XX:+PrintGCTimeStamps"
  JAVA_OPTS="$JAVA_OPTS -XX:+PrintClassHistogram"
  JAVA_OPTS="$JAVA_OPTS -XX:+PrintTenuringDistribution"
  JAVA_OPTS="$JAVA_OPTS -XX:+PrintGCApplicationStoppedTime"
  JAVA_OPTS="$JAVA_OPTS -Xloggc:/var/log/elasticsearch/gc.log"
fi
```

With these options enabled, a new log file will be created which will log the occurrences and durations of each GC event. We confirmed that ElasticSearch was in fact garbage collecting for large amounts of time… and with more RAM, it took longer when something went wrong. The JVM (unlike ruby) is good at obeying its min and max allocation, so the system RAM was never overloaded, which was what we were used to looking for as a problem signal. The `ES_HEAP_SIZE` environment variable is used for this purpose. You can modify it within /etc/init.d/elasticsearch on Ubuntu.

To fix the long garbage collection time, we switched from OpenJDK to Oracle's "official" Java branch. We also switched to the G1GC garbage collector from the default ConcMarkSweepGC method. To change this, you will need to modify bin/elasticsearch.in.sh, as the older options are hard-coded:

```text
...
# JAVA_OPTS="$JAVA_OPTS -XX:+UseParNewGC"
# JAVA_OPTS="$JAVA_OPTS -XX:+UseConcMarkSweepGC"
# JAVA_OPTS="$JAVA_OPTS -XX:CMSInitiatingOccupancyFraction=75"
# JAVA_OPTS="$JAVA_OPTS -XX:+UseCMSInitiatingOccupancyOnly"

# Try new GC
JAVA_OPTS="$JAVA_OPTS -XX:+UseG1GC"
...
```

Depending on your OS, you may also need to modify /etc/init.d/elasticsearch to point to the proper version of the Java binary.

---

---
url: /blog/post/2012-05-22-electric-imp.md
description: >-
  Where have I been for the past month? Well, I've been heads-down helping to launch the internet of things. Now that we have launched, I can…
---

![](/images/medium-export/1__NiynxQ3d7ZQollTpvh7l0A.png)

Where have I been for the past month? Well, I've been heads-down helping to launch the internet of things. Now that we have launched, I can tell you all about it!

![](/images/medium-export/0__nv10WME8B__6IgrwJ.jpeg)

I've had the chance to help the team over at Electric Imp un-stealth and blow minds. In a nutshell, the imp is a tiny computer small and cheap enough to get anything online (truly)! [Gizmodo does a good job of explaining the imp](http://www.youtube.com/embed/ezFsOBQCcPU) and [Engadget has some more depth in their interview with Hugo](http://www.engadget.com/2012/05/21/hands-on-with-the-electric-imp-at-maker-faire-video/) at Maker Faire.

I like to think that I'm a fairly sharp nerd, and when I saw Electric Imp's demo over a month ago I was *actually* blown away. The ideas aren't new, but the execution is superb. For me, the 3 things that are required for the "internet of things" to really take off are:

* Cost: It needs to be cheap. With a (starting) price point of only $25, the imp is in the right ballpark.
Also, the fact that you can pass imps around between your devices lowers the aggregate cost over time. * APIs. The value of the internet is that stuff connects to other stuff (tm). If the platform is a walled garden, than it’s not going to work. The hardware APIs on the imp are excellent, as the imp speaks just about every serial protocol that exists (spy, UART, etc) and the web service is designed to be very accessible (I made sure that every imp in your plan has its own JSON feed you can check) and the team will be adding more services all the time. * My mom can use it. Did you see that super-sexy drag-and-drop planner. Yeah. My time with Electric Imp was short, but it was an awesome experience working with them and presenting to the press and working the floor at Maker Faire. Yes, my next [internet-enabled nerf gun](http://blog.evantahler.com/pivotal-tracker-phidgets-and-nerf-guns) will be imp-powered. ![](/images/medium-export/0__CRBx2tNgIrSjzq51.jpeg) [Get on the mailing list for the dev kits](http://www.electricimp.com/interested/) (there may not be enough)! #### [http://www.electricimp.com](http://www.electricimp.com/) --- --- url: /contact.md description: Get in touch with Evan Tahler. --- # Contact ![contact](/images/contact-3.png) ## Let's chat! If you are looking to hire me on a contractor basis, please reach out via [Delicious Hat](https://www.delicioushat.com), my web technology development and consulting firm. If you are looking for help with ActionHero, please [join the ActionHero community](https://www.actionherojs.com/community). Otherwise — [schedule a chat with me](https://calendar.google.com/calendar/u/0/appointments/schedules/AcZssZ16A24TTpkRzn9hmftYCLsGXROlTohM1YCsVjklKO7CPv4H56FX00q4-4HOwH5OZvO_EaeSVD67). --- --- url: /open-source.md description: >- Evan Tahler's contributions to open source software including Actionhero, Grouparoo, Airbyte, and node-resque. --- # Open Source ![open source](/images/open-source-3.png) I contribute to a number of open source projects because I believe it is a great way to give back to the programming community in both a professional and personal capacity — and a great way to learn about new technologies and tools. You can [sponsor my open source work via GitHub Sponsorships](https://github.com/users/evantahler/sponsorship). ## Featured Projects ### ArcadeAI/arcade-mcp ### airbytehq/airbyte ### grouparoo/grouparoo ### actionhero/actionhero ### actionhero/node-resque ### actionhero/ah-sequelize-plugin ### taskrabbit/elasticsearch-dump ### taskrabbit/empujar ### evantahler/dont-be-a-jerk --- --- url: /speaking.md description: >- Talks Evan Tahler has given on Node.js, Ruby, DevOps, AI, Data Engineering and Product Management. --- # Speaking ![speaking](/images/speaking-3.png) I've given a number of technical talks, focusing on Node.js, Ruby, AI, and DevOps. ## Featured Talks ### Tools! — A History of Agents Doing Stuff *AI Agents SF #9: Past, Present, and Future — December, 2025* How agents escaped the chat box and gained access to hundreds of MCP servers, the patterns emerging for keeping them safe, and where things are heading next — with risky live demos. Followed by a panel on the past, present, and future of AI Agents alongside Erik Meijer (Normal Computing), Vincent Koc, and Allie Jones. 
* [Event Page](https://luma.com/kff29lg4) ### MCP After Dark: Live Demo of MongoDB × Arcade *SF Tech Week (Arcade.dev HQ) — October, 2025* A live demo of the MongoDB integration with Arcade, co-presented with Anaiya R (Senior Technical Evangelist, MongoDB), at Arcade.dev's SF Tech Week event. The night also featured a panel on the next generation of products built on MCP with Nate Barbettini (Arcade), Max Gerber (Stytch), David Garnitz (Yapify), and Apoorva Joshi (MongoDB), moderated by Gabriela de Queiroz. * [Event Page](https://partiful.com/e/6j92SfgQDCFZOJCHzPsT) ### AI in the Data Stack: From Dashboards to Agents *Arcade × Airbyte Webinar — August, 2025* A conversation with Alex Girard (Airbyte) about AI security in the modern data stack, landing on a mental model that changes everything: your LLM is just another user. We dig into the patterns and pitfalls of letting agents touch real data systems. * [Video](https://www.youtube.com/watch?v=E39LhjEpo-I) ### Design Principles for ELT Database Destinations *move(data) — December, 2023* The session will address issues such as data type errors, schema changes, and data accessibility. Attendees will learn about Airbyte's innovative approach to ensuring easy-to-query tables, decoupling sync errors from data errors, and enhancing overall data observability. * [Speaker Page](https://movedata.airbyte.com/event/design-principles-for-elt-database-destinations) ### git push your data stack with Airbyte, Airflow and dbt *Airflow Summit — May, 2022* Treat your data stack like a software project. Co-presented with Marcos Marx (Airbyte), this session walks through using Git, Airflow, and dbt to manage Airbyte connections as code — so the people who build your pipelines can ship them with the same workflow they use for everything else. * [Video](https://www.youtube.com/watch?v=_pLDo04sv2U) * [Session Page](https://airflowsummit.org/sessions/2022/git-push-your-data-stack-with-airbyte-airflow-and-dbt/) ### How I learned to Stop Worrying and Let the Robot Publish to NPM *CascadiaJS — September, 2020* As professional developers, we /probably/ don't deploy code directly to production and we /usually/ test things first. There's a whole world of tools and best practices like Git Flow, Continuous Integration, and Review Apps to help us build and deploy our apps and websites... but what about the developer tools we use every day? This talk will focus on how to parallel some of these same best-practices when making developer tools and frameworks. Together we will build a CI/CD pipeline for publishing to packages to NPM. * [Speaker Page](https://2020.cascadiajs.com/speakers/evan-tahler) ### Sharing Typescript Types Across the Stack *SeattleJS — February, 2020* Use Typescript to share types between your Frontend and Backend. Discover the shape of your data to avoid mistakes! * [Slides](https://docs.google.com/presentation/d/1LrG0ptT7K-AG1d_1f_mefgVLUGm-TcEqzPdP7n0jf_4) * [Code](https://github.com/evantahler/pokemon-typescript) * [Video](https://www.youtube.com/watch?v=tJtL4LtKQnA) ### Using Next.JS to build Static Dynamic Websites… and never pay for font-end hosting again! *SeattleJS — September, 2019* This talk was inspired by a group of students learning to code in Seattle who were being taught tools like React and Angular, but struggling to learn how to deploy their sites using modern methods. Specifically, how to set up CI/CD (Continuous Integration + Continuous Deployment) and HTTPS. 
* [Slides](https://speakerdeck.com/evantahler/using-next-dot-js-to-build-static-dynamic-websites-dot-dot-dot-and-never-pay-for-font-end-hosting-again) * [Code](https://github.com/evantahler/next-static-hosting) ### Background Tasks in Node.js: A survey with Redis. *RedisConf — May, 2016* Node.js' Async programming model allows us to emulate many types of advanced systems. In this talk, we will use node and redis to recreate 7 different types of background job systems, from queues to kafka. * [Video](https://www.youtube.com/watch?list=PL83Wfqi-zYZHtHoGv3PcGQA3lvE9p1eRl\&time_continue=218\&v=NNTsHzER31I) * [Slides](https://speakerdeck.com/evantahler/background-jobs-in-node-dot-js-redisconf-2016) * [Article](https://blog.evantahler.com/background-tasks-in-node-js-a-survey-with-redis-971d3575d9d2) ### Node for ! (not) HTTP *SF Node — Dec, 2015* Node.js is great for all sorts of projects. In this demo, we will use Node.js to control the lights in our house via the DMX protocol. * [Slides](https://speakerdeck.com/evantahler/node-for-not-http) * [Code](https://github.com/evantahler/node_for_not_http) --- --- url: /blog/post/2012-12-04-exception-patterns-in-node-js.md description: >- Recently we added support for a ‘developer mode’ to ActionHero which reloads parts of your project on the fly as you develop. Doing so… --- Recently we added support for a ‘developer mode’ to [ActionHero](http://actionherojs.com) which reloads parts of your project on the fly as you develop. Doing so enabled developers to go from seeing A to B ### A: Wild Exceptions ![](/images/medium-export/0__o0B__oyddYtvd8ZjV.jpeg) ### B: Caught Exceptions ![](/images/medium-export/0__S7brpr23SnqN3__pg.jpeg) Notice that not only did the application not crash in a fiery blaze, but it also returned a meaningful response back to the user ({ "error":"The server experienced an internal error" } along with a classy ‘500’ header ) . When using developer mode, ActionHero watches various files for changes, and reloads them on the fly. However, if you are like me, you are very likely to introduce a bug as you are developing. In the above example, I referenced a variable that wasn’t defined, but it’s just as likely that I might have broken the parser with some malformed JSON. When fleshing out the ‘safety features’ around developer mode (which makes liberal use of domains), I realized that there are actually 4 classes of exceptions in node, and they each have their own patterns of error handling: * Synchronous Exceptions * Asynchronous Exceptions * callback(error, data) * callback(modifiedObject, nextAction) * Domain Exceptions #### Footnote: We are talking about Exceptions not Strings First, let’s clear up that we are talking about exceptions and not passing strings or null around. Other folks have written eloquently on the topic, so let me just say that in node, there is a difference between ‘throw "my error";’ and ‘throw new Error("my error");’ most important of which is the stack trace. The node Error object has all sorts of great properties which you should check out in the docs. ### Synchronous Exceptions For synchronous functions, the most common pattern when something goes wrong is to throw an Error object. Generally speaking, returning both ‘null’ and ‘false’ might have syntactic meaning (AKA: I have null results to return), so they might be misinterpreted by clients. 
This might seem like it is breaking the 'promise' set up by using a function (in that it should return a value in all cases), but choosing to throw an exception is a blocking way to denote that a blocking function has something wrong with it. Here's a simple example:

```js
var addSync = function (a, b) {
  if (!isNumber(a)) {
    throw new Error(a + " is not a number");
  }
  if (!isNumber(b)) {
    throw new Error(b + " is not a number");
  }
  return parseFloat(a) + parseFloat(b);
};

function isNumber(n) {
  return !isNaN(parseFloat(n)) && isFinite(n);
}
```

When using this pattern, the callers of your method can be sure that the function worked as intended, or that an error was thrown. Now, you might not want that thrown exception to crash your application, so you can use JS' try/catch functions to wrap a function which you think *might* throw an exception. In this manner you can also choose to do whatever you want with the exception object returned, including displaying its stack trace, etc. This programming style enforces that your methods are called 'correctly' (as you have defined 'correct' to be), and developers must opt in to treating your method in a 'fuzzy' way by wrapping it with a try/catch block. By the way, javascript's try{} catch(e) {} methods **only** work with synchronous functions.

```js
try {
  addSync(1, "word");
  console.log("YAY");
} catch (err) {
  console.log(err.stack);
}

// Error: word is not a number
//     at addSync (/Users/evantahler/Desktop/test.js:3:28)
//     at Object. (/Users/evantahler/Desktop/test.js:11:14)
```

Sometimes you don't have the ability to ensure that your functions return an error rather than throw it. For example, the `require` method will throw an error if the file you are loading has a syntax error. In these situations you have 2 options: either wrap the call in a try/catch, or use node's `process.on('uncaughtException', function(err) { })` global catch. It's generally a bad idea to use this global catcher as you might be ignoring important errors which really should cause your application to stop. In actionHero, loading in all user-created files (actions, tasks, routes, etc.) is wrapped in a try/catch block so that one bad action won't stop your application from booting (but we will log a nice error message for you), while loading the core actionHero files should still cause the application to crash if something is wrong with them.

### Asynchronous Exceptions

There are 2 common async patterns in node: the callback(error, data) pattern and the callback(modifiedObject, nextAction) pattern. The most common pattern for async code is to respond to the callback with callback(error, data). This pattern follows a similar 'promise' an async function makes, namely that it will eventually respond to its callback with information on its own operation (error) and any results of the operation. What's nice about the (err, data) pattern is that you can always expect 2 values, and you can't confuse which is which.

```js
var addAsync = function (a, b, next) {
  if (typeof next != "function") {
    return new Error("this is an async function and expects a callback");
  }
  if (!isNumber(a)) {
    next(new Error(a + " is not a number"), null);
  } else if (!isNumber(b)) {
    next(new Error(b + " is not a number"), null);
  } else {
    next(null, parseFloat(a) + parseFloat(b));
  }
};

function isNumber(n) {
  return !isNaN(parseFloat(n)) && isFinite(n);
}
```

Happily in javascript, `null == true` evaluates to false, so you can check the results of any asynchronous function using this pattern with `if(err){ } else { }`.
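For example, a caller of the `addAsync` function above might branch on the error argument like this (a minimal usage sketch):

```js
// Calling addAsync with the (error, data) callback pattern.
addAsync(1, "word", function (err, sum) {
  if (err) {
    // the Error object carries a stack trace, just like the synchronous version
    console.log(err.stack);
  } else {
    console.log("the sum is " + sum);
  }
});
```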
In ActionHero, every action and task is technically one big async function and we use the second callback pattern. First, we know that the callback for every action will always be to some sort of renderer (depending on the connection type), and we know that we must always respond to the client, even if there is an error. With this knowledge, we can enforce the rule that every action's callback is next(connection, toRender). toRender in this case is a boolean instructing the callback whether or not it needs to render a response to the client. For example, if the action was "file", and the action already streamed down the contents of a .jpg, there's nothing left to send to the client.

But what about errors? In actions, we are always passing around the 'connection' object, and this object's connection.error state is very important. It's not only rendered back to the user, but it can also be used as a mechanism for flow control. actionHero's promise pattern is modified to use a data object (in this case, connection) to reflect state. For example, if String(connection.error) == "user not authenticated" rather than null, you know that you probably shouldn't run that 'changePassword()' method next. The promise here now becomes that the async function will always return a modified version of the data object passed to it, and instructions on what to do next.

An example of actionHero's type of callback pattern is:

```js
var addAsyncWithDataObject = function (object, next) {
  if (!isNumber(object.a)) {
    object.error = new Error(object.a + " is not a number");
  } else if (!isNumber(object.b)) {
    object.error = new Error(object.b + " is not a number");
  } else {
    object.result = parseFloat(object.a) + parseFloat(object.b);
  }

  if (object.error == null) {
    next(object, true);
  } else {
    next(object, false);
  }
};

function isNumber(n) {
  return !isNaN(parseFloat(n)) && isFinite(n);
}
```

This version of the async function would be called like this:

```js
data = {
  a: 1,
  b: 2,
};

addAsyncWithDataObject(data, function (data, toRender) {
  if (!toRender) {
    // handle error
  } else {
    // yay
  }
});
```

### Domain Exceptions

In recent node.js versions (>= 0.8.0), domains have been introduced. Domains are a tool with which you can wrap a section of your code in an asynchronous container. Think of them like an async version of try/catch. An example:

```js
var domain = require("domain");

var addAsync = function (a, b, callback) {
  if (!isNumber(a)) {
    throw new Error(a + " is not a number");
  }
  if (!isNumber(b)) {
    throw new Error(b + " is not a number");
  }
  var response = parseFloat(a) + parseFloat(b);
  callback(null, response);
};

var isNumber = function (n) {
  return !isNaN(parseFloat(n)) && isFinite(n);
};

var containWithDomain = function (a, b, callback) {
  var wrapper = domain.create();
  wrapper.on("error", function (err) {
    callback(new Error("caught by domain"), null);
  });
  wrapper.run(function () {
    addAsync(a, b, function (err, response) {
      callback(err, response);
    });
  });
};
```

Here you can see that our addAsync method now throws errors rather than passing them to the callback. Left uncaught, they would normally crash the application. However, if we use containWithDomain() rather than addAsync(), we can actually catch those thrown errors and handle them in a meaningful way. Domains are great when you don't know exactly what the underlying code is doing, but should be avoided in favor of one of the first 2 patterns whenever possible.
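For example, calling containWithDomain() with bad input hands the caught error back through the normal callback instead of crashing the process (a minimal usage sketch):

```js
containWithDomain(1, "word", function (err, result) {
  if (err) {
    // the domain caught the thrown exception; the process did not crash
    console.log(err.message); // "caught by domain"
  } else {
    console.log("sum: " + result);
  }
});
```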
ActionHero uses domains to run users' action and task code, ensuring that one poor action won't crash the whole app (this is also how we are able to catch exceptions and render custom error traces, as shown in the image above). You can see actionHero's exception code here.

### Thanks!

--- --- url: /blog/post/2019-10-07-failing-a-task.md description: Welcome to the second installment of The Illustrated Actionhero Community Q&A! ---

Welcome to the second installment of The Illustrated [Actionhero](https://www.actionherojs.com) Community Q&A! Every week in October I'll be publishing a conversation from the [Actionhero Slack community](http://slack.actionherojs.com) that highlights both a feature of the Actionhero Node.JS framework and the robustness of the community's responses… and adding some diagrams to help explain the concept.

### Failing a Task

October 7th, 2019

[Source conversation in Slack](https://actionherojs.slack.com/archives/C04EVSUSD/p1568673475021200)

Daniele asks:

> Scenario: I have a task whose `run()` method contains a call to a function returning a promise. If the returned promise gets rejected, I need the task to fail and to be sent to the failed queue. On the docs I saw that throwing an error accomplishes this. My problem is: how to throw an error from a catch statement? I mean: I tried something like:

```js
asyncFun()
  .then(...)
  .catch(err => {
    throw new Error('operation failed')
  })
```

> but this is going to catch the exception. How can I properly make the task fail given a rejected promise? Thanks

First, let's talk about Tasks. One of the features of Actionhero is that it includes a number of tools out-of-the-box that go beyond "just running your HTTP API". Tasks are Actionhero's mechanism for running background jobs. Background jobs are an excellent pattern when you:

* Run a calculation on a recurring schedule, like calculating high scores
* Defer communicating with third-party services (like sending emails or hitting APIs) that can be slow, so the work can be retried on failure
* Move some slower work to another process to keep your API responses quick.

Actionhero's Task System is built on the [node-resque](https://github.com/taskrabbit/node-resque) package to be interoperable with similar job queues in Ruby and Python. You can learn more about tasks in the Actionhero documentation. A task is defined like this:

```js
// file: tasks/sayHello.js
const { Task, api } = require('actionhero')

module.exports = class SayHello extends Task {
  constructor () {
    super()
    this.name = 'say-hello'
    this.description = 'I say Hello on the command line'
    this.frequency = 0 // not a periodic task
  }

  async run ({ params }) {
    api.log(`Hello ${params.name}`)
  }
}
```

And invoked anywhere else in your codebase like this: `await api.tasks.enqueue('say-hello', {name: 'Sally'}, 'default')`

Enqueuing your task will add it to a queue to eventually be worked by any of the Actionhero servers working those queues:

![](/images/medium-export/1__KMPhTzPQSR1js3sK__FZebw.png)

Now back to Daniele's Question. When a Task "Fails", it's logged, and it is also moved to a special list in Redis called the "Failed Queue". Actionhero and Resque keep the task, its arguments, and the exception thrown so you can choose to retry it or delete it. There are plugins you can install to retry a Task a few times if you want, or auto-delete it… but that's up to you. The [ah-resque-ui](https://github.com/actionhero/ah-resque-ui) plugin does a good job of visualizing this.
You can see the exception, the arguments to the job, and when it was run.

![](/images/medium-export/1____Rq1h6E4uvVpkC0H3033LQ.png)

The community suggested:

> (I think that there are) 2 options:
>
> 1. don't catch
> 2. use async/await asyncFunc() (and again, don't try/catch)
>
> If you want to modify the error returned in some way, in your catch block you can format a new error string and throw it again.

For example, you might have a task that communicates with a third-party API, and you want to make the error message more clear:

```js
// file: tasks/sendEmail.js
const { Task, api } = require("actionhero");

module.exports = class SendEmail extends Task {
  constructor() {
    super();
    this.name = "send-email";
    this.description = "I send an email";
    this.frequency = 0; // not a periodic task
  }

  async run({ params }) {
    try {
      await api.email.send(params);
    } catch (error) {
      const betterError = new Error(`could not send email: ${error.message}`);
      betterError.stack = error.stack;
      throw betterError;
    }
  }
};
```

Elaborating more on option #2:

> You can imagine all tasks as already being wrapped in a big try/catch. So what is eventually thrown will be caught and bubbled out to the failed queue in resque. Actions are the same way actually: that's how we can send a 500 response to the client and not just take down the server.

Finally, Daniele asked if the return value of the `run` method matters:

> nope. whatever you return will be logged, but that's about it. there are some plugins/middleware that might care about the return value, but by default it doesn't matter. I usually like to return a string to be logged… like if I had a nightly task to delete old database records, I might log how many rows were deleted or something…

And Devxer added:

> It's worth mentioning that the task runner used for testing will return the results of the task, so if you plan to write tests for your tasks you can use the return value to test what might otherwise be "side-effect" results.

As your application grows, you will invariably need a framework to process data in the background. Actionhero ships with a scalable Task system you can use from day one. Give it a try!

--- --- url: /blog/post/2013-04-14-forklift-moving-big-databases-around-in-ruby.md description: Moving Big Databases Around in Ruby ---

![](/images/medium-export/0__T3e9y__efvaQCz99b.jpg)

### What?

[Forklift](https://github.com/taskrabbit/forklift) is a ruby gem that can help you collect, augment, and save copies of your mySQL databases. This is often called an ["ETL" tool](http://en.wikipedia.org/wiki/Extract,_transform,_load) as the steps in this process mirror the actions of "Extracting the data," "Transforming the data," and finally "Loading the data" into its final place. With Forklift, you create a **Plan** which describes how to manipulate your data. The process for this involves (at least) three databases:

* Live Set
* Working Database
* Final Database

The "Live Set" is first loaded into the "Working Set" to create a copy of your production data we can manipulate without fear of breaking replication. Then, any transformations/manipulations are run on the data in the working set. This might include normalizing or cleaning up data which was great for production but hard for analysts to use. Finally, when all of your transformations are complete, that data is loaded into the final database. Forklift is appropriate to use by itself or integrated within a larger project.
Forklift aims to be as fast as can be by using native mySQL copy commands and eschewing all ORMs and other RAM hogs. ### Features * Can extract data from both local and remote databases * Can perform integrity checks on your source data to determine if this run of Forklift should be executed * Can run each Extract either each run or at a frequency * Can run each Transform either each run or at a frequency * Data kept in the woking database after each run to be used on subsequent transformations * Only ETL’d tables will be copied into the final database, leaving other tables untouched * Emails sent on errors ### What does TaskRabbit use this for? At TaskRabbit, the website you see at [www.taskrabbit.com](https://www.taskrabbit.com) is actually made up of many [smaller rails applications](http://en.wikipedia.org/wiki/Service-oriented_architecture). When analyzing our site, we need to collect all of this data into one place so we can easily join across it. We replicate all of our databases into one server in our office, and then use Forklift to extract the data we want into a common place. This gives us the option to both look at live data and to have a more accessible transformed set which we create on a rolling basis. Our "Forklift Loop" also git-pulls to check for any new transformations before each run. ### Example Annotated Plan In Forklift, you build a plan. You can add any action to the plan in any order before you run it. You can have 0 or many actions of each type. ```ruby require 'rubygems' require 'bundler' Bundler.require(:default) require 'forklift/forklift' # Be sure to have installed the gem! ######### # SETUP # ######### forklift = Forklift::Plan.new({ :local_connection => { :host => "localhost", :username => "root", :password => nil, }, :remote_connections => [ { :name => "remote_connection_a", :host => "192.168.0.0", :username => "XXX", :password => "XXX", }, { :name => "remote_connection_b", :host => "192.168.0.1", :username => "XXX", :password => "XXX", }, ], :final_database => "FINAL", :working_database => "WORKING", :do_dump? => true, :dump_file => "/data/backups/dump-#{Time.new}.sql.gz", :do_email? 
=> true, :email_to => ['XXX'], :email_options => { :via => :smtp, :via_options => { :address => 'smtp.gmail.com', :port => '587', :enable_starttls_auto => true, :user_name => "XXX", :password => "XXX", :authentication => :plain, } } }) ########## # CHECKS # ########## forklift.check_local_source({ :name => 'CHECK_FOR_NEW_DATA', :database => 'test', :query => 'select (select max(created_at) from new_table) > (select date_sub(NOW(), interval 1 day))', :expected => '1' }) forklift.check_remote_source({ :connection_name => "remote_connection_b", :name => 'ANOTHER_CHECK', :database => 'stuff', :query => 'select count(1) from people', :expected => '100' }) ########### # EXTRACT # ########### forklift.import_local_database({ :database => "database_1", :prefix => false, :frequency => 24 * 60 * 60, }) forklift.import_local_database({ :database => "database_2", :prefix => false, :only => ['table_1', 'table_2'], }) forklift.import_remote_database({ :connection_name => 'remote_connection_a', :database => "database_3", :prefix => true, :skip => ['schema_migrations'] }) ############# # TRANSFORM # ############# transformation_base = File.dirname(__FILE__) + "/transformations" forklift.transform_sql({ :file => "#{transformation_base}/calendars/create_calendars.sql", :frequency => 24 * 60 * 60, }) forklift.transform_ruby({ :file => "#{transformation_base}/test/test.rb", }) ####### # RUN # ####### forklift.run ``` ### Workflow ```rb def run lock_pidfile # Ensure that only one instance of Forklift is running rebuild_working_database # Ensure that the working database exists ensure_forklift_data_table # Ensure that the metadata table for forklift exists (used for frequency calculations) run_checks # Preform any data integrity checks run_extractions # Extact data from the life databases into the working database run_transformations # Perform any transformations run_load # Load the manipulated data into the final database save_dump # mySQLdump the new final database for safe keeping send_email # Email folks the status of this forklift unlock_pidfile # Clean up the pidfile so I can run next time end ``` ### Transformations Forklift allows you to create both Ruby transformations and SQL transformations ```rb class Test def transform(connection, database, logger) logger.log "Running on DB: #{database}" logger.log "Counting users..." connection.q("USE `#{database}`") users_count = connection.q("count(1) as 'users_count' from `users`") logger.log("There were #{users_count} users") end end ``` #### Ruby Transformations * SQL Transformations are kept in a file ending in .rb * Ruby Transformations should define a class which matches the name of the file (IE: class MyTransformation would be in a file called my\_transformation.rb * logger.log(message) is the best way to log but logger.debug is also available * database is a string containing the name of the working database * connection is an instance of Forklift::Connection and connection.connection is a raw mysql2 connection * Classes need to define a transform(connection, database, logger) IE: ### SQL Transformations * SQL Transformations are kept in a file ending in .sql * You can have many SQL statements per file * SQL will be executed linearly as it is written in the file SQL Transformations can be used to [generate new tables like this](http://stackoverflow.com/questions/1201874/calendar-table-for-data-warehouse) as well ### Defaults The defaults for a new Forklift::Plan are: ```text 1 { 2 :project_root => Dir.pwd, 3 :lock_with_pid? 
=> true, 4 5 :final_database => {}, 6 :local_database => {}, 7 :forklift_data_table => '_forklift', 8 9 :verbose? => true, 10 11 :do_checks? => true, 12 :do_extract? => true, 13 :do_transform? => true, 14 :do_load? => true, 15 :do_email? => false, 16 :do_dump? => false, 17 } ``` ### Methods #### Test ```text 1 forklift.check_local_source({ 2 :name => STRING, # A name for the test 3 :database => STRING, # The Database to test 4 :query => STRING, # The Query to Run. Needs to return only 1 row with 1 value 5 :expected => STRING # The response to compare against 6 }) 7 8 forklift.check_remote_source({ 9 :connection_name => STRING, # The name of the remote_connection 10 :name => STRING, # A name for the test 11 :database => STRING, # The Database to test 12 :query => STRING, # The Query to Run. Needs to return only 1 row with 1 value 13 :expected => STRING # The response to compare against 14 }) ``` #### Extract ```text 1 forklift.import_local_database({ 2 :database => STRING, # The Database to Extract 3 :prefix => BOOLEAN, # Should we prefix the names of all tables in this database when imported wight the database? 4 :frequency => INTEGER (seconds), # How often should we import this database? 5 :skip => ARRAY OR STRINGS # A list of tables to ignore and not import 6 :only => ARRAY OR STRINGS # A list of tables to ignore and not import (use :only or :skip, not both) 7 }) 8 9 forklift.import_remote_database({ 10 :connection_name => STRING, # The name of the remote_connection 11 :database => STRING, # The Database to Extract 12 :prefix => BOOLEAN, # Should we prefix the names of all tables in this database when imported wight the database? 13 :frequency => INTEGER (seconds), # How often should we import this database? 14 :skip => ARRAY OR STRINGS # A list of tables to ignore and not import 15 :only => ARRAY OR STRINGS # A list of tables to ignore and not import (use :only or :skip, not both) 16 }) ``` #### Transform ```text 1 forklift.transform_sql({ 2 :file => STRING, # The transformation file to run 3 :frequency => INTEGER (seconds), # How often should we run this transformation? 4 }) 5 6 forklift.transform_ruby({ 7 :file => STRING, # The transformation file to run 8 :frequency => INTEGER (seconds), # How often should we run this transformation? 9 }) ``` ### Debug You can launch forklift in "debug mode" with — debug (we check ARGV\[" — debug"] and ARGV\["-debug"]). In debug mode the following will happen: — verbose = true — no SQL will be run (extract, load) — no transforms will be run — no email will be sent — no mySQL dumps will be created ### Options & Notes * email\_options is a hash consumed by the [Pony mail gem](https://github.com/benprew/pony) * Forklift’s logger is [Lumberjack](https://github.com/bdurand/lumberjack) with a wrapper to also echo the log lines to stdout and save them to an array to be accessed later by the email system. * The connections hash will be passed directly to a [mysql2](https://github.com/brianmario/mysql2) connection. Follow the link to see all the available options. ### Limitations * mySQL only (the [mysql2](https://github.com/brianmario/mysql2) gem specifically) ### [Forklift is available now. 
Enjoy!](https://github.com/taskrabbit/forklift) [**taskrabbit/forklift**](https://github.com/taskrabbit/forklift)

--- --- url: /blog/post/2020-10-16-typescript-frontend-backend.md description: Use Typescript to more tightly couple your React apps to your API ---

![Keyboard image](/images/posts/2020-10-16-typescript-frontend-backend/keyboard.jpeg)

Two of the major components of the `@grouparoo/core` application are a Node.js API server and a React frontend. We use [Actionhero](https://www.actionherojs.com) as the API server, and [Next.JS](https://nextjs.org/) for our React site generator. As we develop the Grouparoo application, we are constantly adding new API endpoints and changing existing ones. One of the great features of Typescript is that it can help not only to share type definitions within a codebase, but also *across* multiple codebases or services. We share the Typescript `types` of our API responses with our React Frontend to be sure that we always know what kind of data we are getting back. This helps us ensure that there is a tight coupling between the frontend and backend, and that we will get compile-time warnings if there's something wrong.

## Getting the type of an API Response

In Actionhero, all API responses are defined by Actions, which are classes. The `run()` method of the Action class is what is finally returned to the API consumer. Here's a prototypical example of an action that lets us know what time it is:

```ts
import { Action } from "actionhero";

export class GetTime extends Action {
  constructor() {
    super();
    this.name = "getTime";
    this.description = "I let you know what time it is";
    this.inputs = {};
    this.outputExample = {};
  }

  async run() {
    const now = new Date();
    return { time: now.getTime() };
  }
}
```

This action takes no input, and returns the current time as a `number` (the unix epoch in ms). The action is also listed in our `config/routes.ts` file as responding to `GET /time`.

The next step is to extract the `run()` method's return type to get the `type` of the API response. We can use a helper like [`type-fest`'s](https://www.npmjs.com/package/type-fest) `PromiseValue` to get the return value, or we can do it ourselves:

```ts
// from https://www.jpwilliams.dev/how-to-unpack-the-return-type-of-a-promise-in-typescript
export type UnwrapPromise<T> = T extends Promise<infer U>
  ? U
  : T extends (...args: any) => Promise<infer U>
  ? U
  : T extends (...args: any) => infer U
  ? U
  : T;
```

So, the type of the Action's response is:

```ts
type ActionResponse = UnwrapPromise<typeof GetTime.prototype.run>; // = { time: number; }
```

And in our IDE:

![Display TS types](/images/posts/2020-10-16-typescript-frontend-backend/screenshot-1.png)

This is excellent because now any changes to our action will result in the `type` being automatically updated!

## Consuming the API Response Type in React

The Grouparoo Application is stored in a [monorepo](https://github.com/grouparoo/grouparoo), which means that the frontend and backend code always exist side-by-side. This means that we can reference the API code from our Frontend code, and make a helper to check our response types. We don't need our API code at run-time, but we can import the `types` from it as we develop and compile the app to Javascript. The first thing to do is make a utility file which imports our Actions and extracts their types.
Grouparoo does this in `web/utils/apiData.ts`:

```ts
import { UnwrapPromise } from "./UnwrapPromise";
import { GetTime } from "../../api/src/actions/getTime";

export namespace Actions {
  export type GetTime = UnwrapPromise<typeof GetTime.prototype.run>;
}
```

This `apiData.ts` will allow us to more concisely reference `Actions.GetTime` in the rest of our react application. Now, to use the Action's response type, all we have to do is assign it to the response of an API request:

```tsx
import { useState, useEffect } from "react";
import { Actions } from "../utils/apiData";

export default function TimeComponent() {
  const [time, setTime] = useState(0);

  useEffect(() => {
    load();
  }, []);

  async function load() {
    const response: Actions.GetTime = await fetch("/api/time").then((res) =>
      res.json()
    );
    setTime(response.time);
  }

  if (time === 0) return <div>loading...</div>;

  const formattedTime = new Date(time).toLocaleString();
  return <div>The time is: {formattedTime}</div>;
}
```

Now we have enforced that the type of `response` in the `load()` method above will match the Action, being `{ time: number; }`. We will now get help from Typescript if we don't use that response value properly as a number. For example, assigning it to a string variable creates an error.

![TS error](/images/posts/2020-10-16-typescript-frontend-backend/screenshot-2.png)

## Summary

Since Typescript is used at “compile time”, it can be used across application boundaries in surprisingly useful ways. It's a great way to help your team keep your frontend and backend in sync. You won't incur any runtime overhead using Typescript like this, and it provides extra certainty in your test suite that your frontend will use the data it gets from your API correctly.

--- --- url: /blog/post/2021-02-12-gifit.md description: 'With open source tools, you can easily share and embed your screen recordings' ---

When building Grouparoo, the Grouparoo team often shares screen recordings of our work with each other. In many cases, the tools we are using (like Github, until recently anyway) could only embed image content into READMEs and Pull Requests. That meant that the humble animated gif was often the best way to share a video. Here is my personal script called `gifit` which uses the open source `ffmpeg` and `gifsicle` tools to make it super easy to convert any video file into an easy-to-share gif!

```bash
#!/bin/bash

# This script requires ffmpeg and gifsicle
# On OSX: `brew install ffmpeg gifsicle`

SECONDS=0
INPUT_FILE=$1
BASENAME="${INPUT_FILE%.*}"
OUTPUT_FILE="$BASENAME.gif"

echo "🎥 Converting $INPUT_FILE to $OUTPUT_FILE"

# Convert the video to a gif
ffmpeg -i $INPUT_FILE -pix_fmt rgb8 -r 10 $OUTPUT_FILE -loglevel warning -stats

# Compress the Gif
# Reduce the size to 1/2 the original (because we are recording a retina screen)
# Tweak the "lossy" argument to add more colors, but increase filesize
gifsicle -O3 $OUTPUT_FILE -o $OUTPUT_FILE --lossy=80 --scale=0.5

# How long did it take?
ELAPSED="$(($SECONDS / 3600))hrs $((($SECONDS / 60) % 60))min $(($SECONDS % 60))sec"
echo "🎉 Complete in $ELAPSED"
```

Note that on OS X you will need to `brew install ffmpeg gifsicle` first. So, to make the video above, I:

1. Used Quicktime to record my screen
2. Saved the video as `screenshot.mov`
3. Ran `gifit screenshot.mov` and I got `screenshot.gif`!

--- --- url: /blog/post/2015-04-17-git-whereami.md description: >- I've been traveling a lot for work, and I thought it might be cool to somehow signal where I was physically when checking in code. I knew… ---

I've been traveling a lot for work, and I thought it might be cool to somehow signal where I was **physically** when checking in code. I knew my Mac has a Geolocation framework, but I couldn't figure out a way to access it via the command line. Then I found the wonderful [WhereAmI](https://github.com/robmathers/WhereAmI) project! It's a simple wrapper around the [CoreLocation](http://en.wikipedia.org/wiki/IOS_SDK#Core_Location) framework that spits out your lat/lng. From there, it's a [simple task](https://github.com/evantahler/git-whereami) to pass that up to the Google Geocoding API to get a street address! Here's the readme from [evantahler/git-whereami](https://github.com/evantahler/git-whereami):

[**evantahler/git-whereami**](https://github.com/evantahler/git-whereami)

*Append your location to all of your git commits!*

Do you travel a lot? Would your team be interested to know where your code is coming from? Then this is for you!
#### Install whereami Head on over to and download the whereami excecutable. Place it in your home folder, like ~/whereami ```bash #!/bin/bash # prepare-commit-msg WHEREAMI="$HOME/whereami" LAT=`$WHEREAMI | grep Latitude | awk -F" " '{print $2}' | awk '{print $1}'` LNG=`$WHEREAMI | grep Longitude | awk -F" " '{print $2}' | awk '{print $1}'` URL="http://maps.googleapis.com/maps/api/geocode/json?latlng=$LAT,$LNG&sensor=false" ADDRESS=`eval "curl -s \"$URL\" | grep formatted_address | head -n 1 | sed 's/\"//g' | sed 's/,//g'"` ADDRESS=`echo $ADDRESS | awk -F' : ' '{print $2}'` DATE=`date` printf "\n" >> $1 printf "This commit coded at:\n" >> $1 printf "---------------------\n" >> $1 printf "$ADDRESS\n" >> $1 printf "$LAT, $LNG\n" >> $1 printf "@ $DATE\n" >> $1 ``` #### Setup whereami In whichever git repository you want to use this on, copy the prepare-commit-msg into ~/PROJECT/.git/hooks. **That’s it!** Now, whenever you make a git commit, we will use whereami to source your lat/lng, and then ask Google’s geocoder what your address is, resulting in: ```text # Please enter the commit message for your changes. Lines starting # with '#' will be ignored, and an empty message aborts the commit. # On branch master # Your branch is ahead of 'origin/master' by 4 commits. # (use "git push" to publish your local commits) # # Changes to be committed: # new file: newfile # This commit coded at: --------------------- 63 Hanbury Street London E1 5JP UK 51.520182, -0.070440 @ Fri Apr 17 15:57:03 BST 2015 ``` Feel free to tweak the template as you wish! #### Notes: * OSX Only! --- --- url: /blog/post/2012-01-14-githire.md description: >- David Padilla of Crowd Interactive recently realized he was on this recruiting site called GitHire. They use algorithms to inspect your… --- ![](/images/medium-export/1__81HtnBzez5ptxePVwSMhEg.jpeg) David Padilla of [Crowd Interactive](http://crowdint.com/) recently realized he was on this recruiting site called [GitHire](http://www.githire.com). They use algorithms to inspect your public GitHub repositories to determine how good of a programmer you are. Aside from the fact that this is a limited approach to "skill" which assumes that you use GitHub, I think that this tool has one strong intrinsic problem. This problem is worth talking about because I think that it is reflective of the whole "Startup Culture" I find myself in today. #### The problem: Adoption is a measure of skill. With GitHub, you can follow folks, watch and fork repos, etc. GitHire makes the point to show followers and forks on profile pages, so I am assuming that those factor into their algorithm. The job of an engineer is to get shit done and make products / solve problems. The job of business development is to identify those problems, rank them meaningfully, get those products used (and presumably paid for). We live in strange times where those two skill sets are merging. Of course, I certainly fall into the description of "engineer who also does business development". *Hell, I’m about to attempt* [*another startup*](http://delicioushat.com) *(more coming soon on that topic). I’m going to ignore my obvious hypocrisy for now, but I will address is later* When I think about hiring, I’m very tempted to fall into this trap. "Have you worked on any open source packages I have heard of?" or "Have you patched some important codebase?" are questions I need to force myself not to think. When hiring an engineer the right questions might be about "Scale", but they should remain technical in nature. 
Interview questions can remain about infrastructure and uptime in the context of userbase but not as a source of it. If 10K people downloaded your package from a sharing site you don’t maintain, that doesn’t tell me anything other than that you are really good at identifying engineering problems, but tells me nothing of your skill in the implementation. If 1K of those people who downloaded your package then forked it, that tells me you made a great start at solving the problem, but didn’t address the needs of (at least) 10% of your audience. You are a thought leader, you are a scientist. You might still be a terrible engineer. There’s an important footnote to add here about framework / API development, which I believe might be an exception to the above. If your job is to create tools for other engineers to use, then adoption is a valuable measure of skill. The reason Rails is more popular than Merb has a lot to do with usability and distribution patterns. However, I don’t think that most engineers are hired with such a strong external focus. No engineer can work in a vacuum and other folks will need to interact with your code, but strong modularity is normally enough of an interaction and true APIs and DSLs aren’t needed. Back to my Hypocrisy. When I’m applying for Engineering jobs, I find it best to narrow the scope of my accomplishments to just those of engineering. No engineering hiring manager really should care about how much money I saved the company as a Product Manager. They should care about how my architecture decisions allowed us to reduce the load on our servers, and scale down our hosting environment. See the difference? **TlDR**; Good engineers are lazy. They automate code that writes code for them. In true unix fashion, good engineers are [bricoleurs](http://en.wikipedia.org/wiki/Bricolage) and their job is to build on the work of others, rather than reinventing the wheel each time. I don’t know how counting my lines of code and forks of my repo measures this. Oh, and [David](http://www.githire.com/profiles/dabit) is in the top 2% (In Colima MX) and [I’m in the top %5](http://www.githire.com/profiles/evantahler) (In San Francisco) :D --- --- url: /blog/post/2012-10-04-gitHub-resume.md description: >- This site had the wonderful idea to generate resumes from Github. I like this idea. The more time passes, the more invaluable to the… --- ![](/images/medium-export/1__Rkwpr__BRt7ocxaqxnjnc9Q.jpeg) [This site](http://resume.github.com/) had the wonderful idea to generate resumes from Github. I like this idea. The more time passes, the more invaluable to the ecosystem gitHub becomes. I would invest now. This resume is all client-side JS, and updates on the fly. [Here’s mine](http://resume.github.com/?evantahler). --- --- url: /blog/post/2019-10-19-github-sponsorship.md description: My dream of working on Actionhero full time --- ![](/images/medium-export/1__CrF__V9wBPsZJBFMzr69gQA.jpeg) Hello Actoionhero Community! I’m a little embarrassed to be asking, but I’ve been accepted to be a beta member of Github Sponsors  . This is a new program where you can have a "sponsor" button on your projects and folks can donate to maintainers; like a Patreon for developers. I’m honored to have made the cut for this beta, and it’s largely because of Actionhero. I’ve been the primary maintainer for over 7 years now, the community manager, have been paying all the bills, etc. I’ve put in real time and money to the project. While it has been incredibly rewarding, it has also had a real cost. 
*Would you consider sponsoring me to continue to work on it?* I’ve decided to structure my sponsorship tiers around support, which you can see here . Basically, I’m trading my time for "priority support", in exchange for sponsorship. Nothing will change with regards to how I develop Actionhero, but if you or your company uses Actionhero or node-resque and you would like some help getting started, are curious about how to harden your production deployment, or would like custom plugins or servers, this is a great way to get it! If you are on the fence about donating, now is the best time to do it. Github is matching sponsorships up to $5,000USD to help publicize the program. If I could make a living working on Actionhero, it would be a dream come true! Thank you! **Vist** [**https://github.com/sponsors/evantahler**](https://github.com/sponsors/evantahler) **to start the sponsorship process.** ![](/images/medium-export/1__OzxRp254ukPtqSs4Bg3v__Q.png) --- --- url: /blog/post/2016-05-14-good-old-games-osx-resolution.md description: I’ve been itching for some nostalgic games lately. --- I’ve been itching for some nostalgic games lately. ![](/images/medium-export/1__EL__0uaqprD23nf__HtsqZFQ.jpeg) [Homeworld was "remastered"](http://store.steampowered.com/app/244160/) recently, and you can get it on Steam. It’s still great. [Dungeons II](http://store.steampowered.com/app/262280/) seemed to be the spiritual successor to [Dungeon Keeper 2](https://en.wikipedia.org/wiki/Dungeon_Keeper_2), but I found that it didn’t quite deliver… the imps were too slow, the "overground" didn’t really do much except to slow down the pace of the game (rather than create a challenge). I thought to myself… I wonder if I can still play my old games? Would they still be just as fun? Enter "Good Old Games" (or [GOG](http://gog.com)). This company takes games from 5+ years ago, tweaks them to work on "modern" hardware, and sells them for cheap! A select few have even been ported to OSX! Since these were older windows games, I decided not to go full "boot camp" on OSX, but rather use a windows VM on VMware Fusion. I purchased windows 8 a while back, but now that Mac laptops don’t come with DVD drives… how was I supposed to install it? It turns out you can get windows ISOs from Microsoft directly now! [https://www.microsoft.com/en-us/software-downloa](https://www.microsoft.com/en-us/software-download/)d. This is a huge departure from the old "physical media only" Microsoft I had last been in contact with, circa 2008. I downloaded the install ISO and away I went! If you don’t have a Windows key, there are also ["test" VMs Microsoft releases](https://developer.microsoft.com/en-us/microsoft-edge/tools/vms/mac/) that work for a few weeks… With Windows running in my VM, I bought and downloaded [Dungeon Keeper II](https://www.gog.com/game/dungeon_keeper_2) from GOG, and… it kept crashing! I deleted, re-installed, repaired… it kept crashing. After Googling fruitlessly, a random idea struck me: I Read up on what resolution the game supported, and set my screen resolution at that… it worked! > It looks like many of the older games sold by GOG will \*only\* work if your system resolution is 1024x768. Thinking about this, it makes sense. Back in the 90’s, screens were always square, and rarely higher than this resolution. Even if the game supported changing the system resolution at boot, I’m sure even *checking* a 16x9 resolution might crash things. Hopefully I’ve saved you a few hours! 
I’m off to build my dungeon… ![](/images/medium-export/1__qu5vaRA9sU8nVCxa__6yHcQ.jpeg) --- --- url: /blog/post/2018-01-04-goodbye-scoreboard-guru.md description: Feb 1 Scoreboard Guru shuts down --- **Summary:** * **The Scoreboard Guru application is now removed from iOS App Store.** * **Scoreboard Guru will be shutting down on February 1st.** * **You can export your game data as a CSV from within the application until this date.** * **Those of you who paid to unlock the application will be refunded. Email** [**support@scoreboard.guru**](mailto:support@scoreboard.guru) **so we can arrange the refund.** ![](/images/medium-export/1__E6ZIx0OxnxDZ6dWw4o10og.png) #### Game Over Hello Scoreboard Guru Players. I am sad to report that at the end of the month, I will be shutting down Scoreboard Guru. I built this application to fill a need that I (and my family & friends) had while playing board games. I wanted to build a synchronized, social record of all the games we’ve played to compare and challenge each other… And we did it! For the ~10 months that Scoreboard Guru was out in the wild, we **had a few hundred users score a few hundred matches in over 50 distinct games!** However, I’m just one person, and the cost of running the Scoreboard Guru servers is not being covered by the money I thought we might make by selling the full, unlocked application (which was a $1.99 in-app purchase). This also doesn’t cover my time to keep improving the app, servers, and keeping both bug-free as we add more features. *So, after 11 months of giving it a try, it is time to sunset* [*what I had hoped*](https://blog.evantahler.com/scoreboard-guru-initial-release-14aeec2c66bf) *would become a popular gaming application.* As of now, no new users can download the Scoreboard Guru application, as it has been removed from sale in the iOS app store. If you already downloaded the app in the past, you can restore it. If you are interested in downloading your scores to move to another platform, you can export your data as a CSV from within the app, and it will be emailed to the address on your account. You must do this before February 1st, as that is the day the Scoreboard Guru servers will be shut down. #### Refunds There is no way for me to issue a refund for an in-app purchase via the iTunes/iOS ecosystem. Apple treats their users’ privacy with respect, and that includes not sharing your contact information with me, the app developer. I want to refund those of you who paid to unlock the full application. Please email and we will coordinate a refund of your $1.99 via another method. You will need to provide the email address you used to register Scoreboard Guru with so I can verify the purchase. #### What will you do with my data? All of the data we’ve captured and you have provided to Scoreboard Guru will be destroyed and never shared with anyone. On February 1st, I will be deleting all of the Scoreboard Guru data. This means that our database (with your games, matches, notes, and friends) will be totally deleted. No backups will be kept. All the content you have uploaded (images of your games and matches, along with the detailed records of your matches’ scores) will be deleted, and no backup will be kept. The vendors used by Scoreboard Guru (Heroku, Amazon Web Services, Apple, SendGrid, and Cloudinary) all have excellent policies when it comes to deleting data, and I trust them to instantly comply with my request to destroy it all. 
### Thank You This post has also been sent to all registered users of the Scoreboard Guru app. --- --- url: >- /blog/post/2012-02-24-hacked-or-a-reminder-about-why-unix-permissions-matter.md description: >- Note: I have included links to some of the malicious PHP code which attacked me in this blog post. I had a hard time deciding whether this… --- ![](/images/medium-export/1__Uub7lw9wYczHU7v__vP8WaA.jpeg) **Note:** *I have included links to some of the malicious PHP code which attacked me in this blog post. I had a hard time deciding whether this was ethical or not. Seeing as us "good guys" need to know what to look for (and be aware of the crazy shit hackers are capable of) and realizing that the bad guys already had all this code, I don’t think I’m doing any additional damage. If you disagree, please let me know.* I want this blog to be as insightful as possible to other nerds, so I’m going to explain how I made **a n00b mistake** which ended up costing me an entire day to fix… and will hopefully be a reminder to you all out there in the internet to not do the same. This is also a great excuse to deconstruct the hack I fell prey to and to understand the elaborate ways these things work. To any of you who have visited this blog in the past few weeks and ended up redirected to a shady foreign porn site, this would be why :/ First the basics: This is a [WordPress](http://blog.evantahler.com/wordpress.org) blog (newest version, all pluggins and themes up-to-date) hosted by the fine folks at [DreamHost](http://dreamhost.com/). I’ve been a DreamHost customer for over 5 years now, and for commodity file and small-site PHP hosting, they are the best. Today I’m thankful for their speedy customer support. I’ve got a shared account, which means I share a server with n other customers. There are no long-running processes allowed (other than Ruby Passenger processes) which means that everyone on this sever is doing "basic" web hosting with the occasional cron-job. That’s what I’m doing too. Yesterday I thought that while visiting my site I was quickly redirected to some other site. I assumed it was user error and forgot about it. Then it happened again today, so I decided to investigate, as the website I was taken to was one I had never seen before featuring porn in another language. ![](/images/medium-export/1__TwyKZggmPZ6nLi30ImDNPg.jpeg) The first thing I checked was the database, looking for JS and iFrame tags in the comments, but there was nothing there. Then I looked though the posts and other wordpress tables looking for anything out of place, but still, nothing. Then I wondered if someone was able to break my caching scheme, so I turned off all the cache pluggins I had. Nothing. Finally, I decided to take a peak at the actual PHP files which run this blog and… bingo! There, at the top of EVERY PHP page, was a mysterious base-64 encoded string. ```php ``` For those of you curious as to what the base64 encoded string revealed, I’ve created a [GitHub gist for it](https://gist.github.com/1891398) here (with some formatting to make it legible). The only 3 remote URLs in the entire script are: http://sweepstakesandcontestsdo.com/, http://www.lilypophilypop.com/, http://www.lolypopholypop.com/ (these links are not clickable on purpose… don’t visit them). The lolly sites return a list of other URLs, the ones which users of the site are eventually taken to via the malicious JS. 
The main site, "sweepstakes", has no content, but I assume that is because I’m not being redirected from a previously compromised site. The script was even fancy enough to cache its list of bad URLs locally to reduce load time… how nice! Here is the troubling part about all of this: The blog database is fine, so the attacker likely gained access to the filesystem directly. I quickly updated my account password and notified Dreamhsot, but I was unwilling to believe that I let my password slip to anyone… and I was also unwilling to believe that someone was malicious enough to target this low traffic blog and to go though the trouble to brute-force their way into my account. With a small ssh failed-attempt-lockout limit, that would have taken a very long time. As final proof that the attackers did gain file-system access, I’m sad to say that even non-wordpress pages were ruined. My photo gallery for instance (evantahler.com/gallery), which has no upload or dynamic features, also has the base64 bad string in it. So… how did this happen? It was about this time that I got a message from Dreamhost support letting me know that they had run an automated check on my site looking for malicious code (essentially searching for more wacky base-64 encoded php strings appearing at high frequency and having a recent modification time) and they found more instances of the hack in other mini-sites and sub-domains I have. Uh Oh. I started looking randomly though my directories when I noticed that a directory which should only have video files in it had a a PHP file in it, "r.php". This file had some different base-64 code in it, but it did something different. This script was double-encoded, first in base-64, and then char-split and needed to be processed back into letters before exec’d. [Here is the "decoded" result](https://gist.github.com/1895530). This script creates a secret upload page which takes the uploaded file and then runs it. Note that there are special helpers to both execute a SQL script and a php script. There are also methods to create more files of this same type of fancy base-64 encoding. Now this script alone won’t do anything bad itself, but does basically allow anyone to execute any script they want as the webserver-user (me). Along with this "r.php" file, there were 2 other files one of which was [essentially the same as this backdoor when decoded and deflated (stackOverflow)](http://stackoverflow.com/questions/3328235/how-does-this-giant-regex-work). Yep. I was 100% Pwned. Now that there was a public way to execute arbitrary code and a nice graphical FTP program, an attacker would be able to upload whatever they wanted, and do whatever they wanted. [There were even new .htaccess](https://gist.github.com/1895629) files which took all web-crawlers to more foreign porn sites and hijacked 404 pages. Yes, it really killed my SEO. Ok, so how did this all start? Remember our friend "r.php" from before? Well, it looks like I wasn’t the owner of that file! A different unix user was the owner of that file while I remained the owner of every other file and directory there. So how did he gain access? That video directory was chmod’d to 0777, meaning that everyone who had an account on that server could read/write/exec anything in that directory. Oops. 
Now we have enough of a story to build a timeline: Here’s the order of events as I see them: * I had with a world-writable directory in one of my accounts * Another Dreamhost customer’s account was compromised and was executing a script which went looking for any other accessible files/folders on the server (brute-force) and found my account * Once the script found my account, it copied the back-door upload script * From there, other files could be uploaded/run as my user, and this then spread to all my files, modifying PHP files and creating new .htaccess files * From my account, the script probably went off in search of other open directories again, and the cycle continues All of this was possible because of that one open directory. The moral of the story (tldr;): It wasn’t WordPress’ fault, it wasn’t Dreamhost’s fault. It probably wasn’t even the other Dreamhost customer’s fault (I’ll bet his account was compromised just like mine was). It was my fault for being a n00b and having a 0777 folder. ### Never chmod 0777 anything, especially on a shared server. --- --- url: /blog/post/2021-06-11-heroku-slack-notifications.md description: Get a Slack notification when your Heroku App Deploys! --- At [work](https://www.grouparoo.com), we use Heroku to deploy our staging servers. We also use Slack to centralize all of our notification and monitoring. The one part of out stack that we couldn't get into Slack was staging deployments... so I built a Heroku Buildpack to do just that! ![/images/posts/2021-06-11-heroku-slack-notifications/buildpack.png](/images/posts/2021-06-11-heroku-slack-notifications/buildpack.png) When you enable it, you'll get messages like this: ![/images/posts/2021-06-11-heroku-slack-notifications/slack.png](/images/posts/2021-06-11-heroku-slack-notifications/slack.png) See the code and learn more at --- --- url: >- /blog/post/2022-12-19-build-software-products-faster-by-thinking-like-a-data-engineer.md description: Things I learned at Airbyte --- ![Octavia holding tools](/images/posts/2022-12-19-build-software-products-faster-by-thinking-like-a-data-engineer/image.jpg) For the majority of my career, I would describe the work I did as “full stack web development”. I’ve made websites, integrated with warehouse and billing systems, built mobile apps, and integrated with a ton of third-party APIs. Now that I’ve joined Airbyte and have become familiar with the [Modern Data Stack](https://glossary.airbyte.com/term/modern-data-stack/), I’ve realized that I could have saved a lot of time reinventing the wheel. Don’t be like me - build your products faster by thinking like a Data Engineer! I’ve identified 3 types of projects that are a great fit for off-the-shelf Data Engineering tools: * Bulk Third Party API Integration * Timestamps and Audit Trails * Slow & Sequential Tasks The requirements for almost any tool I select for a production system are: * Open Source or otherwise free to use at a small scale * Easy to run (e.g. “docker compose” or a single process to monitor) * Expose monitoring and uptime hooks for “productionizing” them Lucky for us, tools that satisfy these requirements are easy to find! ## Bulk Third-Party API Integration When I worked at TaskRabbit, we had a project to gather all the data we had about the Taskers in one place to make our Customer Service team as powerful as possible. This meant combining our application data with information from Zendesk, Stripe, and other places, which we then used for customer support and triage. 
At the time, we allocated weeks and built new services and jobs to consume the Zendesk & Stripe APIs to populate tables in our production database. It worked fine, except for all the hiccups and rate-limits that consuming bulk APIs brings… So then we built retry mechanisms, alerting tools, and whatever else we needed to be sure that things were running smoothly. Or, I could have just used [Airbyte](https://airbyte.com/) - a custom-made tool for importing everything an API has to offer, and already handles rate-limits and retries, to load data into our database. It’s a bit strange to think about using your database as the integration point, bypassing any application code, but most of the time, the data you get back from the API is all you want anyway. ## Timestamps and Audit Trails I’ve been involved in many projects to better understand how data was changing in our production systems. Sometimes we wanted to create an audit trail of who did what and when, but other times we just wanted to know when something changed. If tying a change to a user of your application is important, there are great plugins like [PaperTrail](https://github.com/paper-trail-gem/paper_trail) for Rails for this… but if you are only interested in the change itself, check out [CDC](https://airbytehq.github.io/understanding-airbyte/cdc/)! Change Data Capture (CDC) is the process of reading the database’s own changelogs and storing those in another location. Without adding any additional rows, triggers, or application code, you can see *every* update, insert, and delete as an event you can then process. From there, you can trigger whatever recalculation, cache flush, or alert you need! Just be ready for a lot of data - it’s a firehose! ## Slow & Sequential Tasks AKA - “No more nested Cron Jobs”. How many times have you crafted long-running jobs that had complex dependencies or side effects? I’m not talking about the quick background jobs that might exist in your product (e.g. drip campaigns or sending SMS messages - keep using Resque or Kafka), but the slower kinds of jobs that tend to land in the Infrastructure side of the house. An example from my past is “make a nightly backup of this database, upload it to 2 different locations, and then send a slack message that it’s done”. This is, of course, a [DAG](https://glossary.airbyte.com/term/dag-directed-acyclic-graph/), and Data engineers deal with the problem of “[orchestration](https://airbyte.com/blog/data-orchestration-trends#what-is-data-orchestration)” all the time. Tools like [Dagster](https://dagster.io/) and [Airflow](https://airflow.apache.org/) are great at this work! They provide easy ways to shell out and run scripts or hit APIs and provide UIs and metadata endpoints to monitor your jobs, time, and retry them… all with far more visibility than my cron scripts of yore. ## Summary Take a look at what the Modern Data Stack includes - it’s [huge](https://www.moderndatastack.xyz/categories)! I’m sure there are more examples of tools that Data Engineers have built that might apply to other engineering disciplines. What can you think of? --- --- url: /blog/post/2012-07-31-i-am-a-nerd.md description: >- I received some stickers at work today from Joyent and, like I have been doing for the past many years, I stuck them to the back of my… --- ![](/images/medium-export/1__Eo__rHpd5eF3dfGuai__2__Zw.jpeg) I received some stickers at work today from [Joyent](http://joyent.com/) and, like I have been doing for the past many years, I stuck them to the back of my clipboard. 
I realized today that I am a huge nerd. --- --- url: /blog/post/2023-08-29-introducing-airbyte-destinations-v2.md description: 'Making Database destinations faster better, and stronger' --- ![v2!](/images/posts/2023-08-29-introducing-airbyte-destinations-v2/image-1.png) We're excited to announce the public availability of improvements to the way data is synced and handled in destination tables (previously known as normalization). This is Airbyte Destinations V2, which starting today provides: * One-to-one table mapping: Data in one stream will always be mapped to one table in your data warehouse. No more sub-tables. * Improved per-row error handling with \_airbyte\_meta: Airbyte will now populate typing errors in the \_airbyte\_meta column instead of failing your sync. You can query these results to audit misformatted or unexpected data. * Internal Airbyte tables in the airbyte\_internal schema: Airbyte will now generate all raw tables in the airbyte\_internal schema. We no longer clutter your desired schema with raw data tables. * Incremental delivery for large syncs: Data will be incrementally delivered to your final tables when possible. No more waiting hours to see the first rows in your destination table. ‍ Destinations V2 is now available in the latest versions of our Snowflake and BigQuery destinations. Over the next few weeks, Destinations V2 will be rolled out to many more connectors, including Redshift, Postgres, and more. See this guide to learn how to upgrade your connectors, or check out an [example of Destinations V2](https://docs.airbyte.com/understanding-airbyte/typing-deduping#destinations-v2-example). ![v2!](/images/posts/2023-08-29-introducing-airbyte-destinations-v2/image-2.png) ## Audit Content Errors with \_airbyte\_meta ![v2!](/images/posts/2023-08-29-introducing-airbyte-destinations-v2/image-3.png) Airbyte now separates data-moving problems from data-content problems. Prior to Destinations V2, both types of errors were handled the same way: by failing the sync. Now, a failing sync only means that Airbyte could not move all of your data. This is a more flexible approach, as you can now decide how to handle rows with content problems on a case-by-case basis. Per-row error handling also enables you to query the [\_airbyte\_meta column](https://docs.airbyte.com/understanding-airbyte/typing-deduping#_airbyte_meta-errors) to see which rows failed for content reasons, and why. The types of errors which will be stored in \_airbyte\_meta.errors include: * Typing errors: the source declared that the type of the column id should be an integer, but a string value was returned. * Size errors (coming soon): the source returned content which cannot be stored within this row or column (e.g. a [Redshift Super column has a 16mb limit](https://docs.aws.amazon.com/redshift/latest/dg/limitations-super.html)). Destinations V2 will allow us to trim records which cannot fit into destinations, but retain the primary key(s) and cursors and include "too big" error messages. ## Be in Control of Connector Upgrades ![v2!](/images/posts/2023-08-29-introducing-airbyte-destinations-v2/image-4.png) With Destinations V2, we are also excited to announce tooling to help you manage connector updates with breaking changes. Moving forward, whenever there are upcoming breaking changes to one of your connectors, you can now: * Update your source / destination connectors one at a time. 
This provides an easy path for dual-writing to multiple destinations at once, or testing out new updates before committing to them all at once. * (Airbyte Cloud) Upgrade at your own speed. You will be notified when there are breaking changes, then can choose to opt-in at the time of your choosing (within a window) from the Airbyte UI. You can see [here](https://docs.airbyte.com/release_notes/upgrading_to_destinations_v2#breakdown-of-breaking-changes) the breakdown of breaking changes with Destinations V2. The [quickest path to upgrading](https://docs.airbyte.com/release_notes/upgrading_to_destinations_v2#quick-start-to-upgrading) updates all connections tied to your destination in-place, without ever resyncing your historical data. We also have [additional upgrade paths](https://docs.airbyte.com/release_notes/upgrading_to_destinations_v2#advanced-upgrade-paths) available for dual-writing to multiple destinations, or testing out the new format of data - also never requiring you to resync historical data. ## New Possibilities with Destinations V2 Destinations V2 provides the Airbyte team with many new avenues for improving the speed, cost, and effectiveness of how data is replicated to your data warehouses, including but not limited to: * Improving the robustness of schema evolution to new data types * Retaining JSON data from database sources after replication * Reducing cost of typing & deduping on the data warehouse * Extending typing & deduping to new destinations Stay tuned for more information on these releases in the coming months. As always, you can consult our [public roadmap](https://github.com/orgs/airbytehq/projects/37/views/1?pane=issue\&itemId=32661141) for more detail on what’s coming next! --- --- url: /blog/post/2012-08-02-it-has-been-so-long.md description: >- I’ve been pretty bad at posting lately, but that’s not to say that I haven’t been busy! --- ![](/images/medium-export/1__LzV__pBMrIVln2B0SicLS8Q.jpeg) I’ve been pretty bad at posting lately, but that’s not to say that I haven’t been busy! Since we last spoke, I’ve joined [TaskRabbit](http://taskrabbit.com) as their senior systems engineer. I get to play with ruby, node, cloud hosts (Joyent), BI, and everything in between. I’ll have more to say soon! [Oh, and if you want $10 off of a task, click here](https://www.taskrabbit.com/PAL/307359) :D --- --- url: /blog/post/2012-12-23-jekyll-markdown-and-more.md description: 'Jekyll: Where have you been all my life?!' --- ![](/images/medium-export/1__fK7ebyH3YbJsrcRvBR3pIQ.jpeg) ### Jekyll: Where have you been all my life?! [Pablo](http://davemode.com/) turned my on to [Jekyll](http://jekyllrb.com/), a static-ste generator written in ruby. I have been hearing about Jekyll for a year now, but I had never really had given it a shot until today. I had been using Wordpress for my blogging needs at [work](http://taskrabbit.com) and for my personal blog, I had tried tumblr and SquareSpace, and I would rate them all as *adequate*. I had spent a long time tuning my Wordpress installations to where I wanted them to be, and was pretty content to keep toiling away at it. I’m going to give Jekyll the highest complement I can give to a piece of software: **After trying it for a few hours, I threw out wordpress entirely and started over. In ~10 hours I had a completely new blog and it is superior in almost every way.** Hey, you are reading this on my new Jekyll blog; Look at that! 
### The Good: #### Static Sites & Templates I’ve been, let’s call it, *raging* about the need to revert to static sites for a while now. * Static sites enforce good [API-first](http://api-first.com) design patterns * Static sites free up huge chunks of ram from your servers when you don’t need to render views and partials, saving money and time * Static sites can be 100% CDN’d, making your users happy However, while you can have sites which render partials and templates client-side, this usually causes some wacky user experiences, and requires a modern-esque browser. Pre-compiling static HTML pages is a great option, which allows all the fancy new templating tools to be used, and adds no server load. Jekyll doesn’t have *all* the options rails folks would be used to, but it has a healthy collection. You can have nested templates, partials, etc. #### Hosting It’s static HTML, so you can host it anywhere! You can render your site locally (automatically even), and rsync your directory. Done. [GitHub](http://github.com) has free static site hosting, and is even Jekyll aware (they made it), and will compile your site *for* you. Just git push. Right now, I’m hosting this on Github. However, you can’t run custom plugins, and I might go back to DreamHost if I have that need. Setting up a post-commit git workflow is not hard, seeing as there are only 2 commands needed. #### Markdown I’m in love with markdown. To me, it strikes the magic balance between being an expressive text language and being code-aware. Typing code into Wordpress was a HUGE PAIN. There are quite a few options to render code on your pages, but entering it always had problems. Markdown was made for code display. This post was written in markdown. If you are composing markdown on OSX, I recommend the [MOU](http://mouapp.com/) editor. It’s free. #### No Database Required I’m an Ops Guy, and so I’m not scared of hosting any DBs, but having fewer moving parts is always a positive, and it allows more flexible hosting options. #### It’s the web, stupid With Jekyll, you are writing Markdown and HTML. There’s no need for a complex module or plugin system. You want to add Google analytics to your site? Make an \_include with the JS pasted in. Want to use Twitter Bootstrap? Make a CSS and JS folder, and paste the code in. Simple. #### Git You don’t have to keep your Jekyll site in Git, but you should. My revisions are my git history. I can edit things on Github on the web. Hell, I can pull-request articles into my blog if I ever want to. Git is good. #### Local Development is Easy and Encouraged You *can* run wordpress locally, but who does? You can develop Tumblr themes offline, but I don’t know many people who do. With Jekyll, you are encouraged to experiment locally with an auto-refreshing server and quick compile times. ### The Bad #### Importing There are [built in ways to migrate from Wordpress](https://github.com/mojombo/jekyll/wiki/blog-migrations), however, not a single post of mine was transformed properly. Categories became Tags, code samples caused the parser to exit early, and images weren’t exported. I spent about 5 hours cleaning up and manually converting my old posts to markdown, and I’m sure there are still errors (posts with Objective-C seemed to break the most often). I also lost all my comments. I could have exported them, but Jekyll is a static site, and by definition, commenting doesn’t make sense. However, I still **want** comments on my blog, so I opted to use [Disqus](http://disqus.com/), and we will see how it goes.
#### Loss of comments & code **I lost all of my blog’s previous comments**. I realized too late that I could have switched to the Disqus Wordpress plugin, had it import my existing comments, and then put them into my Jekyll blog. Oh well. Sorry, community I had built :/ #### Hard to debug I had a few "pull my hair out (assuming I still had hair)" moments with Jekyll: * You can’t have : in your YAML payload without escaping * If you have character sets other than UTF-8, bad things will happen * On OSX, the default HDD is case-insensitive. Changing postName to postname confuses Jekyll * Different OSes render dates in ruby differently (GitHub vs OSX) * If you have un-parsable files, you probably won’t get a warning and the old version of the file will be left in your \_site folder, convincing you that everything is working OK. Jekyll needs some better docs :/ #### Widgets "Widgets" (if you are coming from Wordpress) need to be built entirely in JS. I built a ["featured image" widget](https://github.com/evantahler/evantahler.github.com/blob/master/_layouts/post.html#L42-L68) for my posts, [a "category" page](https://github.com/evantahler/evantahler.github.com/blob/master/categories.html), and [some other features](https://github.com/evantahler/evantahler.github.com/blob/master/_includes/pagination.html) to flesh out my blog. I am MUCH happier writing these in JS than PHP, but most of these are very visible "hacks", which usually involve dumping data from Jekyll to JS, and then handling the logic. Such is life. **Overall, Jekyll is awesome. The end.** --- --- url: /blog/post/2019-08-09-keep-that-vpn-connected-on-osx.md description: >- I recently found myself traveling regularly, and I wanted to ensure that no matter when I opened my laptop, that my connection would be… --- ![](/images/medium-export/1__8DMeHvOIF__nDLjL4TYrk9A.jpeg) I recently found myself regularly traveling, and I wanted to ensure that no matter when I opened my laptop, my wifi connection would be secure. However, there’s no way using the built-in OSX VPN client to connect on boot or wake-from-sleep, nor is there any way to retry after the failure of your VPN connection. *Good thing that Apple made AppleScript!* I have a personal VPN server running on a $5/mo [Digital Ocean](https://www.digitalocean.com/) server which is configured with this amazing script: . I’ve got my VPN configured in the MacOS Network settings as `vpn-evan` and I’ve got "all network traffic" going through it. That ***should*** keep me safe… ### Create the Reconnection Script Open `Script Editor` and paste in the following:

```applescript
on idle
  tell application "System Events"
    tell current location of network preferences
      set VPNService to the service "vpn-evan" -- replace this with the name of your VPN connection
      if VPNService is not null then
        if current configuration of VPNService is not connected then
          beep
          beep
          beep
          connect VPNService
        end if
      end if
    end tell
  end tell
  -- in "idle" blocks, the number returned is how long to sleep until running again
  return 60
end idle
```

Let’s break down this script: 1. When the application is idle 2. Find your VPN connection 3. If it’s disconnected, beep at us (so we know what’s happening), and then try to connect 4. Sleep for 60 seconds and check again. Be sure to replace "vpn-evan" above with the name of your VPN connection. So if this program is always running in the background, every minute, you will try to connect to your VPN! ### Configure the Application to Run at Boot Now, we want to turn this little script into a program.
**1.** In \`Script Editor\`, go to "export" and save your script as an "application". Click "Stay open after Run Handler". ![](/images/medium-export/1__dCHUfz9YeBrasiBIciXQcw.jpeg) **2.** Open up \`System Preferences\` and then navigate to "Users" and "Startup Items". Drag and Drop your new application there! ![](/images/medium-export/1__Gc5Sia3PbwpNoLGYD__RWJw.jpeg) That’s it! Thanks to [http://osxdaily.com/2016/08/10/auto-connect-vpn-mac-boot-login](http://osxdaily.com/2016/08/10/auto-connect-vpn-mac-boot-login/) --- --- url: /blog/post/2012-01-21-life-manifesto-2012.md description: >- A Life Manifesto is a collection of short "bullet points" which describe what your goals in life/love/society are and how you might… --- ![](/images/medium-export/1__WWcPwkbpWVC8MhnluUephA.jpeg) A Life Manifesto is a collection of short "bullet points" which describe what your goals in life/love/society are and how you might approach those goals. Think of it like a mini-mission-statement. Here is my life manifesto for January 2012. This will evolve over time. Writing this type of document was incredibly hard, even though it turned out to be only ~100 words. * **Life**: If it makes a good story, it was worth it. * **Religion**: I am fairly certain life is a "The Sims" game and God got bored a long time ago. * **Politics**: If I could pass one law, it would be to make it illegal for any politician to mention religion. * **The Secret of Life**: Have fun first and if there is time, leave the world a better place than you found it. There will be time. * **Education**: The best weapon we have to fight ignorance and intolerance is education. You can’t change the current generation, but you can steal the future right out from under them. * **Food**: I am on top of the food chain, so I’ll act that way. * **Piracy**: People pirate because the content was too hard to obtain & use legally, not because I want the artist to starve. * **Media Distribution**: Most people will choose access over quality. * **Ownership**: Why do I need to own it if I have access to it? * **Gamification**: Gamification done poorly is obvious; Gamification done well is simply a game. * **Games**: Games are neither innately good nor bad. They can be used to train killers or teach compassion. Let’s make them awesome. I like this. I think I’ll put this on my about page. *Originally published at 21 Jan 2012* --- --- url: /blog/post/2017-06-09-elasticdump-maintinaer.md description: >- There often comes a time in an open-source project’s lifecycle when the original maintainer needs to move on. Today, that person is me, and… --- ![](/images/medium-export/1__OopZJXTgMJhT1S0pBaGapQ.jpeg) There often comes a time in an open-source project’s lifecycle when the original maintainer needs to move on. Today, that person is me, and the project is [ElasticDump](https://github.com/taskrabbit/elasticsearch-dump)… Personally, I no longer use Elasticsearch in my day-to-day activities, and I’ve moved on from [TaskRabbit](https://www.taskrabbit.com), the wonderful company that sponsored this project in the first place. Since then, I’ve been maintaining this project *passively* for over a year, but now I find that I don’t have the time to give it the focus it requires. I would hate to see a project used by so many people rot away without active support. We’ve got over 2,000 GitHub stars and over 20,000 downloads ***a month*** ([source](https://www.npmjs.com/package/elasticdump)).
I would love to see a member of the community step up and take over the stewardship of this project. There are so many places Elasticdump can go in the future, and that direction is up to you! Here are a few ideas: * More import/export formats, like JSON, CSV, BSON, GZIP, and more! * Better integration with AWS (as many of the features added in the past year work with AWS authentication directly) * Better ‘resume’ features, to start where you left off should your dump become interrupted * Smarter parallelization * Better limit/search/offset tools * And… whatever you can think of! I guess there are some bugs and compatibility issues to fix as well… We’ve also got a fairly popular Docker image (with over 35K pulls; [source](https://hub.docker.com/v2/repositories/taskrabbit/elasticsearch-dump/)), and distribution on a few linux package managers. I’m listing all of these to point out that if you are looking for a way to help make a difference in the Elasticsearch ecosystem, perhaps taking over stewardship of this project is for you! **If you are interested in helping to maintain this open-source ElasticSearch project,** [**please add your name on the related GitHub issue**](https://github.com/taskrabbit/elasticsearch-dump/issues/333). --- --- url: /blog/post/2013-01-29-makara.md description: A read-write splitting adapter for Active Record --- ![](/images/medium-export/0__PQJXKJGTtmUqMwGe.jpg) [Makara is a ruby gem](https://github.com/taskrabbit/makara) which allows your Rails 3.x application to split its database queries based on their contents. The features of Makara include: * Read/Write splitting across multiple databases * Failover om slave errors/loss * Automatic reconnection attempts to lost slaves * Optional "sticky" connections to master and slaves * Works with many database types (mysql, postgres, etc) * Provides a middleware for releasing stuck connections * Weighted connection pooling for slave priority ### Development History The main [www.taskrabbit.com](http://www.taskrabbit.com) site is a Rails3 application. As we have grown, we have learned a few things about our load profile: * We have far more reads than writes (over 20x) * As we have grown, the ratio of reads to writes has remained consistent * As most of our traffic is read-heavy, it can be a few seconds out of date, but there are some pages/use-cases which require up-to-date information. This includes pages you have just created (Task posting) or content you just edited (profile updates) * There are only so many interactions any database can handle before it gets SLOW As TaskRabbit grew, we quickly realized the need to scale our database tier. We don’t yet have the volume of data which would require [traditional sharding](http://stackoverflow.com/questions/1610887/how-to-partition-mysql-across-multiple-servers) (and it’s always nicer to have all of your data in one place for analysis if you can), so we wanted to approach scaling our database tier from a replication point-of-view. As noted above, we aren’t write-heavy, especially since we make use of many temporary stores (like on-disk, memcache, riak, and redis), so the added complexity of master-master replication didn’t seem worth the hassle either. We also have a new bus system current being phased in which will further limit writes. This left traditional master-slave replication and scaling. 
![](/images/medium-export/0__nIlKPUsEErQOUbUW.jpg) By default, the mysql and [mysql2](https://github.com/brianmario/mysql2) adaptors for Ruby don’t have any support for more than one database, so we went exploring for other options. Our first stop was the [(SoundCloud-specific fork) of the master\_slave\_adaptor](https://github.com/soundcloud/master_slave_adapter). We used this in production for some time, but we eventually learned that it had a few bugs regarding the way it checked if a slave was up-to-date, and the majority of the time we ended up reading from our master database. Next we moved on to the Octopus gem ([here’s our fork of it](https://github.com/taskrabbit/octopus/)). While this gem did allow us to do master-slave splitting, it didn’t handle errors so well. In fact, if any of your slaves went down or timed out, the error bubbled up to your application and presented as a normal database error. While we didn’t solve that within Octopus, we at least were able to introduce the notion of a blacklist (i.e., slaves that went bad) and didn’t use them for subsequent requests after the first user saw the 500 error. After some amount of time had passed, we would check the slave again to see if it came back. After using Octopus for a while we noticed that, in some rare cases, our web app could be faster than our database replication. For example, if you just posted a Task, the next page we rendered for you is the public Task page so you can confirm that everything looks as you expected. At this point, the INSERT statement that just ran on the master database may not have finished replicating to the slave(s). If you then query the slave there’s a chance you’ll get a RecordNotFound error. It was this type of error that prompted us to develop the ‘sticky’ notion of choosing a database. In a nutshell, once you have modified a record (INSERT, DELETE, or UPDATE), you should continue to use whichever database you performed that action on for the remainder of your request. This ensures that the data you are using is consistent throughout the request. This notion of sticking was also very important in our Delayed Job workers, which often performed more requests faster than our web servers. Keeping a consistent database is also important when traversing any belongs\_to relationships for obvious reasons. Unfortunately, our logic for this while using the Octopus gem was fairly hacky:

```ruby
ActiveRecord::ConnectionAdapters::AbstractAdapter.class_eval do
  attr_reader :last_query

  def log_with_last_query(sql, name, &block)
    @last_query = [sql, name]
    Octopus::Proxy.master_lock?(sql)
    log_without_last_query(sql, name, &block)
  end

  alias_method_chain :log, :last_query
end
```

You will notice that because the Octopus gem didn’t expose the actual query Active Record created, we hijacked the logger to get the final query. Octopus made its choices of which type of database to use by inspecting the method called from Active Record rather than query inspection. The way we handled "un-sticking" from a database was to reset it after the request ended in our Unicorn configuration. We had some trouble upgrading from Rails 2 to Rails 3 with Octopus so we thought it was time to write our own solution. ### Features First and foremost, Makara needed to be able to allow us to scale our database capacity in a way that didn’t require us to change any code within our application (other than a database.yml update).
We needed the application to be able to perform on a development laptop running vanilla MySQL and in production with n-geography-specific replication shards. The gem also needed to be able to handle the assignment of roles to these databases. Thus, [the structure of our ideal database.yml was born](https://github.com/taskrabbit/makara/blob/master/database.example.yml):

```yaml
production:
  sticky_slave: true
  sticky_master: true
  adapter: makara
  db_adapter: mysql2
  host: xxx
  user: xxx
  password: xxx
  blacklist_duration: 5
  databases:
    - name: master
      role: master
    - name: slave1
      role: slave
      host: xxx
      user: xxx
      password: xxx
      weight: 3
    - name: slave2
      role: slave
      host: xxx
      user: xxx
      weight: 2
```

You will note that you can define "common" connection parameters (like a database name), and overwrite them or provide specifics for each replica (for example in production we have a read+write user for master, and a read+only user for the slaves). The database.yml is structured the same as the underlying connection infrastructure. We have one top-level makara adapter which serves a single purpose: delegating the execution of sql to the best underlying adapter. The underlying adapters are your standard adapters (mysql2, sqlite3, postgresql, etc) with some ruby magic sprinkled on top. The ruby magic isn’t *that* magical, it’s merely some [instance extending](https://github.com/taskrabbit/makara/blob/master/lib/active_record/connection_adapters/makara_adapter.rb) which [overrides the execute()](https://github.com/taskrabbit/makara/blob/master/lib/makara/connection/decorator.rb) method, giving the top-level makara adapter the chance to re-route the execution. On to blacklisting. Because we have inserted ourselves as the Active Record adaptor directly, we have the luxury of actually catching errors from the "real" database adaptors we hold connections to. This made the creation of the blacklist a lot simpler. We can simply hold an array of all the connection pools, and choose which types of errors to catch, and which types to pass through. This also allows us to retry a failed query before passing the results back up to the rest of the Rails stack. Did your read from SLAVE2 just fail because it is under heavy load? It’s cool, let’s try it on SLAVE1 and then MASTER. Now if your master database fails, you have problems, but no gem can save you from that :D, unless you are running in master-master mode, in which case we have your back again. There are types of errors (like duplicate key warnings etc) which you do want to bubble up, so [we have methods to pass those back to the stack.](https://github.com/taskrabbit/makara/blob/master/lib/makara/connection/error_handler.rb) We made the choice to be a Rails-only gem; that allowed us to make use of a middleware that helped us enforce our ‘stickiness’ across requests. We ensured that only one slave was used for all queries in a request by default, and if you wrote to master, you stayed there for the subsequent request. What about the case where you create a new record in one request, and then instantly want to view it in the next? How can you be sure that whichever slave you hit next has the data you need? With a cookie! We use the rails middleware to drop down a cookie if you have been stuck to master, and on the next request, Makara will ensure that you come back to that same database, just for that next request. #### [Makara is available now.
Enjoy!](https://github.com/taskrabbit/makara) --- --- url: /blog/post/2019-12-02-markdown-in-react-and-custom-page-elements.md description: >- I recently moved the Actionhero tutorials from the Actionhero Docs site docs.actionherojs.com to the main Actionhero website… --- ![](/images/medium-export/1__iW7bsn__oLc__LD9HWaD__xbw.jpeg) I recently moved the Actionhero tutorials from the Actionhero Docs site [docs.actionherojs.com](http://docs.actionherojs.com) to the main Actionhero website [www.actionherojs.com](http://www.actionherojs.com). We are switching Actionhero from Javascript to Typescript, and as such we’ve changed from using JSDoc to TypeDoc to generate our documentation site. Previously, we had a custom "theme" for JSdoc which included our Tutorials within the docs, but that was a bit of a hack. To me, there’s a distinction between \`tutorials\` and \`docs\`, and having both in the same place could lead to confusion. This was a great time to make the switch. ### Why Separate Docs from Tutorials? I think to have a well-documented project you need both of these components — Docs and Tutorials, but they aren’t consumed by the same audience in the same way. * **Tutorials/Guides** — These are narrative descriptions of how you might use a feature. They walk through the steps linearly from A to B to C, and when you are done, you have a working thing. These are often geared towards new users to the product or tool. * **Docs** — Docs are API reference guides, method signatures, and generally other hints to how to implement something technically once you understand how & why you might use it. I often reference this wonderful guide by Divio talking about the different types of documentation: . You should read it if you aren’t familiar with the "Cooking" metaphor for documentation. **Markdown in your HTML** ![](/images/medium-export/1__ZKpQLEHWKug23MjlpbmN4Q.png) It was very pleasant to write Actionhero’s tutorials in Markdown. It makes focusing on the content rather than the style very simple, while abstracting away all the DIVs and TAGs of HTML. It also makes it easy to Diff changes when updating the site (i.e. when looking at a Pull Request). With the goal of keeping this part of the site in Markdown, we needed to find a way to render it in React. The [React Markdown](https://github.com/rexxars/react-markdown) package is wonderful at this step. You can load in a Markdown file and React Markdown will generate the HTML. A few tips: * We use Next.js. The way that Next.js handles hydration of pages from the server to the client wants to pass DATA and not HTML. This means that if we were to render the markdown content on the server when doing a hot-reload of the page (i.e. navigating from another page to this page), the markdown HTML would not properly render. That’s why we parse the markdown at the `componentDidMount` stage of the lifecycle. This may have adverse effects on the SEO of those pages. * You can load the markdown file into your app as a Prop derived via `getInitialProps`! This means that the markdown content will be passed down from the server on initial page load.
```js export default class ToutorialPage extends Component { static async getInitialProps(ctx) { const name = ctx.query.name; const markdown = await require(`./../../tutorials/${name}.md`); return { markdown: markdown.default, name, }; } render() { return ( ); } } ``` ### Hooking into Rendering to modify State In the example above you can see that ***react-markdown*** lets us provide special renderers for each HTML element. 2 things that were important to this project were rendering code properly, and adding sub-navigation to each page. Adding code was easy, as we already had a component for rendering code based on [react-syntax-highlighter](https://github.com/conorhastings/react-syntax-highlighter). ```js import { Component } from "react"; import SyntaxHighlighter from "react-syntax-highlighter"; import { docco } from "react-syntax-highlighter/dist/cjs/styles/hljs"; interface Props { language?: string; showLineNumbers?: boolean; value?: string; } export default class extends Component { render() { const language = this.props.language || "typescript"; const showLineNumbers = this.props.showLineNumbers || false; return ( {this.props.value ? this.props.value : this.props.children} ); } } ``` We just pass that component into our example above: ```js import Code from "./../../components/code"; export default class ToutorialPage extends Component { static async getInitialProps(ctx) { const name = ctx.query.name; const markdown = await require(`./../../tutorials/${name}.md`); return { markdown: markdown.default, name, }; } render() { return ( ); } } ``` Adding navigation was a bit tricker. We accomplished this by creating a custom renderer for Headers that also built up a list of all the section headers into the page’s `state` with this new `parseHeading` method: ```js import Code from "./../../components/code"; export default class ToutorialPage extends Component { static async getInitialProps(ctx) { const name = ctx.query.name; const markdown = await require(`./../../tutorials/${name}.md`); return { markdown: markdown.default, name, }; } render() { return ( ); } } ``` `this.state.sectionHeadings` is built in our render as we parse the headers. We then have this available to the rest of the page to draw our side navigation! Notes: * Since we are changing `state` within the render method, it’s easy to get into an infinite loop. That’s why we need to only modify the list of headers (`sectionHeadings`) if the header isn’t present. * Since we have access to the header’s render method now, we add more style! Here we are adding our custom RedLine component to draw a line under the header of each section ![](/images/medium-export/1__dvdjutLY__deRPGfL__FU44Q.png) * In the final version of the page’s source (which you can see [here](https://github.com/actionhero/www.actionherojs.com/blob/master/pages/tutorials/%5Bname%5D.tsx)) you can see that we do even more in the header’s render message, link changing colors if the section is in view, highlighting things, etc. It’s very powerful! You can read more about Actionhero’s move to Typescript in the new \`Typescript\` Tutorial here: (yes, it’s written in markdown)! --- --- url: /blog/post/2011-12-01-movember.md description: Beard --- For the month of November, ModCloth men have been participating in Movember [as explained on blog.modcloth.com](http://blog.modcloth.com/2011/11/11/modcloth%E2%80%99s-men-of-movember/). 
**I won the contest:** ![](/images/medium-export/1__IuJs3BGPSPBA4E__PDKcWLQ.jpeg) --- --- url: /blog/post/2014-12-07-node-for-not-http.md description: >- On Thursday 2014–12–04 I gave a talk as the SF Node.js Club entitled "Node for Not HTTP". --- On Thursday 2014–12–04 I gave a talk as the [SF Node.js Club](http://www.meetup.com/Node-js-Serverside-Javascripters-Club-SF/events/209529622) entitled "Node for Not HTTP". ![](/images/medium-export/0__GCSqPEgHzU4Qd__PH.jpeg) The talk gave an overview of some of the cool things you can do with node.js that are not related to websites. To do this, I cobbled together an [ActionHero](http://www.actionherojs.com) project that would take input form HTTP, WebSockets, Telnet, and twitter which allowed us to turn a desk lamp on and off. I taught a room about the [DMX lighting protocol](https://github.com/evantahler/node_for_not_http/blob/master/dmx.js) and how to build an ActionHero server in 20 minutes! * [**Slides**](https://docs.google.com/presentation/d/1jijGOARfeMDSqZnl52uv3N9zL8K4Eiu4g6wGTtUIll4) * [**Github Project**](https://github.com/evantahler/node_for_not_http) * [**Video**](http://strongloop.com/node-js/videos/#An-Introduction-to-Non-HTTP-Clients-for-Node.js) [**Node for Not HTTP** \_For a better viewing and editing experience, install the free app.\_docs.google.com](https://docs.google.com/presentation/d/1jijGOARfeMDSqZnl52uv3N9zL8K4Eiu4g6wGTtUIll4/edit "https://docs.google.com/presentation/d/1jijGOARfeMDSqZnl52uv3N9zL8K4Eiu4g6wGTtUIll4/edit")[](https://docs.google.com/presentation/d/1jijGOARfeMDSqZnl52uv3N9zL8K4Eiu4g6wGTtUIll4/edit) [**evantahler/node\_for\_not\_http**](https://github.com/evantahler/node_for_not_http) --- --- url: /blog/post/2011-12-01-node-spider.md description: Sometimes nodeJS amazes me. --- ![](/images/medium-export/1__rso6t9yfQIcm9iMis4yiVw.jpeg) Sometimes [nodeJS](http://nodejs.org/) amazes me. At [Evil Genius Designs](http://evilgeniusdesigns.com) we spent a lot of time coming up with a way to keep our application servers in sync, in real-time. One of the core pieces of software we built (in PHP mind you) was a system that dealt with SMS and Voice phone calls to play games with. SMS messages are transactional in the traditional sense. You can treat them like a web request coming in, store them to a database, and process them at some frequency. You can think of an SMS-based game as being just like processing tweets or emails. At some frequency you poll the database or API and ask "what’s new since the last time I asked?" and then process the results. SMS as a protocol is slow, and so games based around SMS have to assume there will be 5–10 second latency between the player sending the message from their phone and the game receiving the input. Because of this, the SMS backend can be scaled just like a normal web application (read/write replicas of a DB, multiple nodes to handle requests, shared session storage, etc). However, phone calls interactions need to be fast. For a game to feel right, I need to be able to press the number key on my phone and within 100–300 ms see my input reflected on the screen. We also knew that our system had to scale to handle 100K+ callers at once, and that wasn’t going to happen on a single server. We needed true synchronous message passing between our nodes which bypassed the database as a common datastore. 
We needed a game (which was connected to node 1) to be passed the phone input from Player 1 (connected to node 2) which signaled a message to be played back to Player 2 (connected to node 3). To handle this, we invented what we called the Spider. The job of the Spider was to take a message from node A and pass it to all other nodes it knew about, and do this FAST. The Spider was the only application we had to write in a language other than PHP. We chose C++ because we thought we needed threading (one thread per connected node) and we needed robust port management. What seemed like a relatively simple routing task took over a month. There were timing issues, thread locking issues, and all hell broke loose if the spider lost the connection to the PHP socket-server application which was handling the actual user connections on each node. Eventually we got it working, and as far as I know, the Spider is still keeping everyone in sync today. So what does this have to do with nodeJS? Well, I found myself in a very similar situation recently. I have a framework ([PHP version](https://github.com/evantahler/PHP-DAVE-API), [nodeJS version](https://github.com/evantahler/nodeDaveAPI)) which I am using as the backend to a game. I want the API for the game to support both traditional HTTP-based connections and persistent socket clients. Just like before, the HTTP actions can be handled in the normal web way with the application stack asking the DB for information and passing it back, but the socket connections demand more real-time information as it comes from either the other users or the game itself. I was going to need multiple nodes. I need to keep them in sync. So I thought I would give building the spider in nodeJS a try. Only 4 hours later, I not only had recreated all the functionality of the original Spider, but I also added support for fancy logging, "chat rooms", and a keep-alive manager. Win. [Here it is, in all its open-source glory for you to use and enjoy](https://github.com/evantahler/nodeSpider). **tldr**; Real time communication in node is easy. You can use node to keep many other apps in sync. [Here’s a demo project for you to play with and extend](https://github.com/evantahler/nodeSpider) --- --- url: /blog/post/2012-01-16-running-nodejs-on-sbc2-phidget-board.md description: >- After almost 15 hours of compile-try-fail-repeat, I’ve figured out the formula for compiling nodeJS on a Phidget SBC2 board! --- ![](/images/medium-export/1__nsVkXw9p4jX04PKNRCsUdA.jpeg) After almost 15 hours of compile-try-fail-repeat, I’ve figured out the formula for compiling [nodeJS](http://nodejs.org/) on a [Phidget SBC2 board](http://www.phidgets.com/products.php?category=0\&product_id=1072_0)! This combination of Node and Phidgets creates what is, in my opinion, the best sensor prototyping platform for under $250. This makes use of [my previously mentioned NPM package for connecting node.js to phidgets](http://blog.evantahler.com/on-nodejs-and-phidgets). This was possible only with the help of the friendly people at Phidget, Github, and numerous sites stumbled upon via Google. Get Hacking!!! I made a small nodeJS [app which reads in the temperature of my house every minute and Tweets it](https://twitter.com/#!/phidgetnode)… Because I can. * One of the harder steps of this process was cross-compiling V8.
I’ve saved off my v8 compiled for Phidget SBC2 (armv4tl compatible) for you to skip a few steps in the process (and so you don’t need to also install Debian somewhere) (no longer works :( ) * Detailed steps of this process are copied below, [but also here on a GitHub gist](https://gist.github.com/1574158) * A lot of people helped me out, specifically [these](https://github.com/joyent/node/issues/2131) [folks](http://code.google.com/p/v8/wiki/CrossCompilingForARM) I [am](http://forum.qnap.com/viewtopic.php?p=242405) [linking](https://github.com/joyent/node/issues/2131#issuecomment-3499634) [here](https://github.com/bnoordhuis). Node is awesome, Phidgets are awesome. Synergy. *All I wanted to do was tweet the temperature of my house automatically…* ### What you will need: * A computer (or a virtual machine) running a full version of the Debian operating system * I used Debian 6.0.3, 64bit * A 1GB (or more) USB memory stick * The [phidgetsbc2](http://www.phidgets.com/products.php?category=0\&productid=10720) doesn’t have enough ram to compile node, so we will be using this memory stick as swap space… which is likely to destroy the memory stick * Internet connectivity for both your Debian computer and the Phidget board ### Cross-Compile V8 Locally The v8 stack simply won’t compile on the Phidget board. I think that it has to do with floating point precision, but I can’t be sure. Either way, we are going to compile an ARM binary on our "big" Debian computer and copy it over: #### **Get codesourcery** SSH to your Debian machine, and su root:

```bash
sudo mkdir /opt/codesourcery
cd /opt/codesourcery
wget http://www.codesourcery.com/sgpp/lite/arm/portal/package4571/public/arm-none-linux-gnueabi/arm-2009q1-203-arm-none-linux-gnueabi-i686-pc-linux-gnu.tar.bz2
tar -xvf arm-2009q1-203-arm-none-linux-gnueabi-i686-pc-linux-gnu.tar.bz2
```

#### Get Node and build it for the Phidget board

```bash
wget http://nodejs.org/dist/v0.6.7/node-v0.6.7.tar.gz
tar -xvf node-v0.6.7.tar.gz
cd node-v0.6.7/deps/v8
export TOOL_PREFIX=/opt/codesourcery/arm-2009q1/bin/arm-none-linux-gnueabi
export CXX=$TOOL_PREFIX-g++
export AR=$TOOL_PREFIX-ar
export RANLIB=$TOOL_PREFIX-ranlib
export CC=$TOOL_PREFIX-gcc
export LD=$TOOL_PREFIX-ld
export CCFLAGS="-march=armv4 -mno-thumb-interwork"
# OR:
# export CCFLAGS="-march=armv4 -mno-thumb-interwork -mtune=xscale -mno-thumb -mfloat-abi=soft -mfpu=maverick"
export ARM_TARGET_LIB=/opt/codesourcery/arm-2009q1/arm-none-linux-gnueabi/libc
scons armeabi=soft wordsize=32 snapshot=off arch=arm library=shared mode=release
scons armeabi=soft wordsize=32 snapshot=off arch=arm library=shared mode=release sample=shell
```

#### **Copy the entire v8 directory to the memory stick** I was running Debian in a virtual machine on my OSX machine, so I rsync’ed it:

```bash
rsync -avz root@{remote_host_ip}:/root/node-v0.6.7/deps/v8 /Volumes/{memory_stick}/node
```

### Update the Phidget Board #### **New Firmware** * (USB Key needed) * Copy to the USB stick and follow the instructions in the web console #### **Config (via web interface)** * turn on ssh { } * Install C++ Developer Headers { } * Include full Debian Package Repository { } #### **Local Configuration (via SSH) on the Phidget board**

```bash
ssh root@phidgetsbc.local
apt-get update
apt-get -u upgrade
apt-get install gcc wget python openssl make scons libssl-dev libax25 libfile-copy-recursive-perl openbsd-inetd tcpd update-inetd python-software-properties pkg-config htop git subversion
```

### Copy over and configure V8 Plug in the USB drive.
I kept the now-compiled V8 source in /node/v8 on the memory stick ```bash export PATH=$PATH:/opt/bin echo "/opt/lib" >> /etc/ld.so.conf ldconfig mkdir /opt/share/v8 cp -a /media/{usb_stick_usb_path}/node/v8 /opt/share/. echo "/opt/share/v8" >> /etc/ld.so.conf ldconfig ``` ### Add more RAM This is likely to destroy the memory stick after a lot of use (USB hates random I/O). Create a swap file and configure it (will take ~10 min) ```bash dd if=/dev/zero of=/media/usb0/swapfile bs=1M count=256 mkswap /media/usb0/swapfile swapon /media/usb0/swapfile ``` ### Node.js ```bash export JOBS=1 export CC='gcc -march=armv4 -mfloat-abi=soft' export CCFLAGS='-march=armv4 -mfloat-abi=soft' export CXX='g++ -march=armv4 -mfloat-abi=soft' export GCC='-march=armv4 -mfloat-abi=soft' wget http://nodejs.org/dist/v0.6.7/node-v0.6.7.tar.gz tar -xvf node-v0.6.7.tar.gz rm node-v0.6.7.tar.gz cd node-v0.6.7 ./configure --shared-v8 --shared-v8-libpath=/opt/share/v8 --shared-v8-includes=/opt/share/v8/include --without-snapshot ## If the configuration isn't all green, something is wrong make make install ``` Note: For me, a few times various parts of the 35 steps make preforms will crash with a segmentation fault. I guess this has to do with ram? Make will resume where you left off last, so just run it again ### NPM ```bash curl [http://npmjs.org/install.sh](http://npmjs.org/install.sh) | sh ``` ### Contributors to this guide: * * * * --- --- url: /blog/post/2011-12-11-node-checker.md description: >- At ModCloth, our Ops team wanted to have a real time dashboard of important site information so they could know ASAP if something was wrong… --- ![](/images/medium-export/1__1fdk42KmQdR6Swb1gDjkcw.png) At [ModCloth](http://modcloth.com), our Ops team wanted to have a real time dashboard of important site information so they could know ASAP if something was wrong with the site. We utilize a number of third party customer tracking tools with excellent dashboards (Google Analytics, Omniture), but they often had a 1 hour or more lag. We also have a number of back-end monitoring tools (AirBrake/HopToad, New Relic), and while these tools are (almost) real time, they don’t have a view into customer behavior. We even have analytic tools which can visualize data from our production database or data warehouse, but these tools are more for data mining than visualization. So, I decided to build one. Inspired by the [panic status board](http://www.panic.com/blog/2010/03/the-panic-status-board/) and some other big boards I’ve seen, this project is meant to be a simple way for you to monitor anything you want. It’s a small nodeJS project built on the actionHero framework (more coming soon) that uses simple config files to allow you to create real time charts of your important data. Here’s a screen shot: Configuring checks is as simple as making a new entry in `checks.json`: ```json { "name": "http_google_com", "type": "httpRequest", "frequencyInSeconds": 10, "entriesToKeep": 100, "params": { "hostname": "http://www.google.com", "matcher": "</div>" } } ``` This check would do a web request to google.com every 10 seconds, and then parse the response for the string \. The chart will graph the time this operation took, and the pie graph will show how many times the check was successful (success in this case means having a complete response with the string \ found). 
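Under the hood, a check like this boils down to a timed HTTP request plus a substring match. Here is a minimal sketch of that idea using node’s core `http` module; the function name `runHttpCheck` and the result shape are illustrative, not nodeChecker’s actual internals:

```js
// A hypothetical, stripped-down version of an "httpRequest" style check:
// time a GET request and test the body for a matcher string.
var http = require("http");

function runHttpCheck(params, callback) {
  var start = Date.now();
  http
    .get(params.hostname, function (res) {
      var body = "";
      res.on("data", function (chunk) {
        body += chunk;
      });
      res.on("end", function () {
        // "check" passes only if the matcher string was found in the response
        callback({
          durationMs: Date.now() - start,
          check: body.indexOf(params.matcher) !== -1,
        });
      });
    })
    .on("error", function () {
      callback({ durationMs: Date.now() - start, check: false });
    });
}

// roughly what a checks.json entry would feed in
runHttpCheck(
  { hostname: "http://www.google.com", matcher: "</div>" },
  function (result) {
    console.log(result); // e.g. { durationMs: 123, check: true }
  },
);
```

The `entriesToKeep` setting in the config above implies the real checker also keeps a rolling history of these results for the charts; this sketch only reports a single timed pass/fail.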
Because this is *exactly* the type of operation node is great at (handling routing while grabbing data from other sources), I also made it easy to add new "checks" to the application. A new proxy checker which would generate a random number on each cycle would be: ```js var checker = {}; checker.name = "randomNumber"; checker.params = { required: [], optional: [], }; checker.check = function (api, params, next) { var response = {}; response.error = false; response.check = false; var number = Math.random() * 100; response.number = number; response.check = true; next(response); }; exports.checker = checker; ``` Hopefully this is simple as well! [Check out the project, and have fun!](https://github.com/evantahler/nodeChecker) --- --- url: /blog/post/2012-01-30-nodechecker-update-now-with-ssh-and-sockets.md description: >- The nodeCheker project which I spoke about a few months ago got an update today. Now you can execute arbitrary SSH commands to other… --- ![](/images/medium-export/1__wRHkZEzat__8rR47SSAMKeA.jpeg) The [nodeCheker](https://github.com/evantahler/nodeChecker) project which I [spoke about a few months ago](http://blog.evantahler.com/nodechecker-big-board-dashboarding) got an update today. Now you can execute arbitrary SSH commands to other servers (like checking load, disk space, etc). You can then parse the returned strings via regex to extract a graph-able quantity. For example. here is my script to monitor the used disk space on the [ActionHero demo server](http://actionhero.evantahler.com/): ```json { "name": "disk_space_on_actionHero_demo_server", "type": "ssh", "frequencyInSeconds": 10, "entriesToKeep": 100, "params": { "hostname": "actionhero.evantahler.com", "user": "userNameHere", "command": "df", "sshKey": "/path/to/your/file.pem", "regex": "/dev/xvda1\\s*\\d*\\s*\\d*\\s*\\d*\\s*(...)%" } } ``` which parses a response like: ```text Filesystem 1K-blocks Used Available Use% Mounted on /dev/xvda1 8256952 1407440 6765640 18% / tmpfs 305624 0 305624 0% /dev/shm ``` and would return **18** as the quantity I was looking for. The goal of this project is to create a simple tool for ops-types to monitor their applications, and SSH checking was a oft-requested feature. Huzzah! I’ve also updated the project to push messages to connected SSH users by room. You can be in the "all" room to get all the check results, or a room named by the name of the [checkout the project to learn more](https://github.com/evantahler/nodeChecker). [**evantahler/nodeChecker**](https://github.com/evantahler/nodeChecker) --- --- url: /blog/post/2012-02-13-node-spider-now-on-actionhero.md description: >- A quick update about the nodeSpider project (my first public node.js project!): It is now based on ActionHero. --- ![](/images/medium-export/1__n__p2AQEh65jifa73Si__btQ.jpeg) A quick update about the nodeSpider project (my first public node.js project!): It is now based on [ActionHero](http://actionherojs.com). I [announced the nodeSpider](http://blog.evantahler.com/node-spider-writing-a-real-time-messaging-relay-in-nodejs) project here, and it was what first "sold" me on node.js as a framework. Without going into all of the details of the project, I can simply say that what used to be a very hard sync problem in C++ is now easy to do in node. Since I completed [version 1 of the project](https://github.com/evantahler/nodeSpider/tree/04c1b1fff932bd5bcff8dc78e0d0fa9d32f259fb), I have gone on to create actionHero which used many of the ideas from nodeSpider. 
Now the project has come full-circle and now implements actionHero directly. This is a significant update, as the syntax of the outputs has been updated to match actionHero, and hopefully this makes the responses more intelligible :D this project will now benefit from the normal bug-fix pipeline of actionHero. Here’s a new conversation between 2 peers (and the server log it generated) as an example of the updated syntax: ### Client 1 ```text > telnet localhost 5555 Trying 127.0.0.1... Connected to localhost. Escape character is '^\]'. {"welcome":"Welcome to the Node Spider communication server.","room":"defaultRoom","context":"api","messageCount":0} say hello from client 1 {"context":"response","status":"OK","messageCount":1} {"message":"hi! from client 2","from":"ccc158b6b5ab19cff3eca71a876f83fc","context":"user","messageCount":2} roomView {"context":"response","status":"OK","room":"defaultRoom","roomStatus":{"members":\[{"id":"18c3ab44cb093ba3a400aab48fafcdbe"},{"id":"ccc158b6b5ab19cff3eca71a876f83fc"}\],"membersCount":2},"messageCount":3} roomView {"context":"response","status":"OK","room":"defaultRoom","roomStatus":{"members":\[{"id":"18c3ab44cb093ba3a400aab48fafcdbe"}\],"membersCount":1},"messageCount":4} roomChange secretRoom {"context":"response","status":"OK","room":"secretRoom","messageCount":5} {"message":"still talking in secret room","from":"ccc158b6b5ab19cff3eca71a876f83fc","context":"user","messageCount":6} {"context":"api","status":"keep-alive","serverTime":"2012-02-13T01:18:07.185Z","messageCount":7} quit {"status":"Bye!","messageCount":8} Connection closed by foreign host. ``` ### Client 2 ```text > telnet localhost 5555 Trying 127.0.0.1... Connected to localhost. Escape character is '^\]'. {"welcome":"Welcome to the Node Spider communication server.","room":"defaultRoom","context":"api","messageCount":0} {"message":"hello from client 1","from":"18c3ab44cb093ba3a400aab48fafcdbe","context":"user","messageCount":1} hi! from client 2 {"context":"response","messageCount":2} roomChange secretRoom {"context":"response","status":"OK","room":"secretRoom","messageCount":3} talking in secret room {"context":"response","messageCount":4} still talking in secret room {"context":"response","messageCount":5} {"context":"api","status":"keep-alive","serverTime":"2012-02-13T01:18:07.185Z","messageCount":6} quit {"status":"Bye!","messageCount":7} Connection closed by foreign host. ``` ### Server Log: ```text $ npm start > spider@2.0.0 start /Users/evantahler/PROJECTS/nodeSpider > node ./spider.js 2012-02-12 17:17:07 | no ./tasks.js file in project, loading defaults tasks from /Users/evantahler/PROJECTS/nodeSpider/node_modules/actionHero/tasks.js 2012-02-12 17:17:07 | periodic (internal cron) interval set to process evey 60000ms 2012-02-12 17:17:07 | data cache from backup file. 2012-02-12 17:17:07 | \*\*\* Server Started @ 2012-02-12 17:17:07 @ web port 8080 & socket port 5555 \*\*\* 2012-02-12 17:17:07 | Boot Sucessful! 2012-02-12 17:17:08 | socket connection 127.0.0.1 | connected 2012-02-12 17:17:09 | socket connection 127.0.0.1 | connected 2012-02-12 17:17:13 | > socket request from 127.0.0.1 | say hello from client 1 2012-02-12 17:17:19 | > socket request from 127.0.0.1 | hi! 
from client 2 2012-02-12 17:17:19 | action @ 127.0.0.1 | params: {"action":"hi!"} 2012-02-12 17:17:24 | > socket request from 127.0.0.1 | roomView 2012-02-12 17:17:32 | > socket request from 127.0.0.1 | roomChange secretRoom 2012-02-12 17:17:35 | > socket request from 127.0.0.1 | roomView 2012-02-12 17:17:44 | > socket request from 127.0.0.1 | talking in secret room 2012-02-12 17:17:44 | action @ 127.0.0.1 | params: {"action":"talking"} 2012-02-12 17:17:50 | > socket request from 127.0.0.1 | roomChange secretRoom 2012-02-12 17:17:56 | > socket request from 127.0.0.1 | still talking in secret room 2012-02-12 17:17:56 | action @ 127.0.0.1 | params: {"action":"still"} 2012-02-12 17:18:07 | \* periodic cron tasks starting now \* 2012-02-12 17:18:07 | starging task: Clean cache object 2012-02-12 17:18:07 | starging task: Clean Log Files 2012-02-12 17:18:07 | starging task: caclculateStats 2012-02-12 17:18:07 | starging task: saveCacheToDisk 2012-02-12 17:18:07 | starging task: pingSocketClients 2012-02-12 17:18:07 | completed task: Clean cache object 2012-02-12 17:18:07 | completed task: Clean Log Files 2012-02-12 17:18:07 | completed task: saveCacheToDisk 2012-02-12 17:18:07 | >> pingSocketClients | sent keepAlive to 2 socket clients 2012-02-12 17:18:07 | completed task: pingSocketClients 2012-02-12 17:18:07 | completed task: caclculateStats 2012-02-12 17:18:07 | \* periodic cron tasks comple. see you again in 60000ms \* 2012-02-12 17:18:11 | > socket request from 127.0.0.1 | requesting disconnect 2012-02-12 17:18:11 | > socket connection 127.0.0.1 disconnected 2012-02-12 17:18:12 | > socket request from 127.0.0.1 | requesting disconnect 2012-02-12 17:18:12 | > socket connection 127.0.0.1 disconnected ``` To implement nodeSpider from actionHero, the only significant pice of code replacing the buil-in api.processAction() method with one that would assume an "unknown" action was a "say" command and should do that instead of throwing an error. Conveniently, there is already a method to update/add api methods as part of the actionHero initializer. Here’s how I did it: ```js var actionHero = require("actionHero").actionHero; var params = {}; params.initFunction = function (api, next) { // update process action to make all sent strings a "say" by default api.processAction = function (api, connection, next) { if (api.configData.logRequests) { api.log( "action @ " + connection.remoteIP + " | params: " + JSON.stringify(connection.params), ); } if (connection.error === false) { connection.action = connection.params["action"]; if (api.actions[connection.action] != undefined) { api.utils.requiredParamChecker( api, connection, api.actions[connection.action].inputs.required, ); if (connection.error == false) { process.nextTick(function () { api.actions[connection.action].run(api, connection, next); }); } else { process.nextTick(function () { next(connection, true); }); } } else { if (connection.action == "" || connection.action == null) { connection.action = "say"; } api.socketServer.socketRoomBroadcast( api, connection, connection.lastLine, ); process.nextTick(function () { next(connection, true); }); } } else { process.nextTick(function () { next(connection, true); }); } }; next(); }; actionHero.start(params, function (api) { api.webServer.webApp.close(); // turn off the webserver api.log("Boot Sucessful!"); }); ``` --- --- url: /blog/post/2012-08-12-npm-and-generators.md description: >- Today I learned that npm (the node.js package manager) has support for arbitrary commands and chained commands. 
I guess I always knew this… --- ![](/images/medium-export/1__sa2zvb4ADQo4G__42OfSjfQ.jpeg) Today I learned that [npm](http://npmjs.com/) (the node.js package manager) has support for arbitrary commands and chained commands. I guess I always knew this (how else can you trigger make && make install after download automatically?), but I finally looked into it. Now that I’m aware of this voodoo, I was able to create a generator for actionHero. If you are familiar with rails, you are familiar all the scaffolding rails has (including the creation of a new project). actionHero isn’t nearly as complex as a modern rails project, but I really like the notion of a single command that sets up a new project for you. Here’s actionHero’s version: npm install actionHero; npm run-script actionHero generate; npm start This will get you started with some basic actions and a task, as well as launching the server locally for http, https, web sockets, and tcp. ![](/images/medium-export/0__u7yDKjGyDN8bg4gR.jpeg) Enjoy! --- --- url: /blog/post/2012-09-09-npm-run-script.md description: >- Just a handy reminder that you can add "arbitrary" commands to your node NPM modules without creating binaries. --- ![](/images/medium-export/1__ocEQaLmuC7iloRNrsFaD__Q.jpeg) Just a handy reminder that you can add "arbitrary" commands to your node NPM modules without creating binaries. One of my favorite features of NPM is the option of installing packages either globally or locally (with local as the default). Do you want a special version of [forever](https://github.com/nodejitsu/forever) for project-\_a and not project-\_b? You can! ```text global: `forever start app.js` local: `./node_modules/.bin/forever start app.js` ``` You can also add ./node\_modules/.bin to your path when developing to make this even easier. This saves me so many headaches when compared to Ruby. There’s no need for something like [Bundler](http://gembundler.com/) (which is a great solution to Ruby’s problems BTW). I need to explicitly opt do something globally, so I should be aware of what I am getting into. The downside of this is potentially wasted disk space for duplicate packages, but I am OK with that. Because of NPM’s philosophy of local execution, you may not want to create binaries for your packages, but still might want a way to call actions from the command line. NPM to the rescue! The following syntax can be used: ```text npm run-script #{package} #{command} ``` You all probably know that you can define a "run" and "test" actions for any package, but you can keep adding more. For example, here’s the relevant scripts block from [actionHero](http://actionherojs.com/): ```json { "scripts": { "start": "./scripts/actionHero", "startCluster": "./scripts/actionHeroCluster", "install": "./scripts/install", "test": "./node_modules/.bin/vows spec/* -v --dot-matrix", "generate": "./scripts/generate", "generateAction": "./scripts/generateAction", "generateTask": "./scripts/generateTask" } } ``` By default, NPM will use the "run", "start", "stop", "restart" and "test" actions from the directory you are currently in. However, you can run actions from any package available to you (either local or global) with run-script. actionHero uses this to crete new projects with the command "npm run-script actionHero generate". Note the syntax: ```text npm run-script #{package} #{command} ``` --- --- url: /blog/post/2014-01-04-on-actionhero-routing.md description: >- I’ve been seeing questions about how actionHero’s routes interact with the actions. 
Here’s a bit more of an explanation (which I’ll keep to… --- I’ve been seeing questions about how actionHero’s routes interact with the actions. Here’s a bit more of an explanation (which I’ll keep up to date on the [ActionHero documentation wiki](http://www.actionherojs.com/docs)): The variables in play here are:

```js
api.config.servers.web.urlPathForActions = "api";
api.config.servers.web.urlPathForFiles = "public";
api.config.servers.web.rootEndpointType = "file";
```

Say you have an action called ‘status’ (like in a freshly generated actionHero project). Let’s start with actionHero’s default config: api.config.servers.web.urlPathForActions = 'api'; api.config.servers.web.urlPathForFiles = 'public'; api.config.servers.web.rootEndpointType = 'file'; There are 3 ways a client can access actions via the web server. * no routing at all and use GET params: server.com/api?action=status * with ‘basic’ routing, where the action’s name will respond after the /api path: server.com/api/status * or you can modify this with routes. Say you want server.com/api/stuff/statusPage

```js
exports.routes = {
  get: [{ path: "/stuff/statusPage", action: "status" }],
};
```

The api.config.servers.web.rootEndpointType is "file" which means that the routes you are making are active only under the /api path. If you wanted the route example to become server.com/stuff/statusPage, you would need to change api.config.servers.web.rootEndpointType to be ‘api’. Note that making this change doesn’t stop server.com/api/stuff/statusPage from working as well, as you still have api.config.servers.web.urlPathForActions set to be ‘api’, so both will continue to work. If you want to shut off access to your action at server.com/api/stuff/statusPage and only allow access via server.com/stuff/statusPage, you can disable api.config.servers.web.urlPathForActions by setting it equal to null (but keeping the api.config.servers.web.rootEndpointType equal to ‘api’). --- --- url: /blog/post/2012-12-13-domains-and-nodejs.md description: Solved! --- ![](/images/medium-export/1__7MU7Ftjk6tpnGoDdNGLkfw.jpeg) ### Solved! The reason for this question was to ensure that in actionHero exceptions thrown after a call to the api.cache methods would still be caught by the domain they should have been in. [Here is the commit](https://github.com/evantahler/actionHero/commit/c5ebfc0e819cc0ed18e3ebb86c09f32b35406d73) which force-binds callbacks from the redis client back to the domain they should have been in. This is needed due to the fact that connection-pooled clients (which were created before the domain) will always revert back to their original scope. ### The Question: I’ve been having trouble lately with domains in node.js, in that I have found a few occasions where what is ‘in scope’ confuses me. Here’s a collection of tests to illustrate my confusion. I set up the test to have a domain which I will run each test in, and I expect all of the tests to throw an error and to be caught by the domain’s on(‘error’) event.
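As an aside, the "force-binding" fix referenced in the Solved! section above boils down to wrapping the pooled client’s callback with the domain’s `bind()` method before handing it to redis. Here is a minimal sketch of that idea (illustrative only, not the actual actionHero commit):

```js
// Sketch of the "force-bind" idea: wrap the callback with the active domain
// so that a client created *before* the domain still reports its errors to it.
var domain = require("domain");
var redis = require("redis");

var client = redis.createClient(); // pooled client, created outside any domain

var d = domain.create();
d.on("error", function (err) {
  console.log("caught by the domain: " + err.message);
});

d.run(function () {
  // Without d.bind(), this throw would escape the domain (the test #5 case below).
  client.hget(
    "hash",
    "key",
    d.bind(function (err, data) {
      throw new Error("An error after redis, now safely caught");
    }),
  );
});
```

With the callback bound this way, even a client created before the domain existed will report its exceptions to the domain’s error handler.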
I chose to use a redis client here (because it’s common), but I do not think that this is a problem with the awesome redis package, and I’ve observed similar behavior with the mysql / sequelize packages.

All of the tests work except #5, which throws an out-of-domain exception and causes the script to crash:

```js
var domain = require("domain");
var redis = require("redis");
var eventEmitter = require("events").EventEmitter;

var tests = [];
var testCounter = 0;

var runTest = function () {
  if (tests.length > testCounter) {
    tests[testCounter]();
  } else {
    console.log("all done!");
    process.exit();
  }
};

var myDomain = domain.create();
myDomain.on("error", function (err) {
  console.log("Yo, I just saved you from the error: " + err);
  testCounter++;
  runTest();
});

// PASSING
tests[0] = function () {
  myDomain.run(function () {
    throw new Error("A simple error");
  });
};

// PASSING
tests[1] = function () {
  myDomain.run(function () {
    setTimeout(function () {
      process.nextTick(function () {
        var E = new eventEmitter();
        E.on("thing", function () {
          throw new Error("A deeply nested error");
        });
        setTimeout(function () {
          E.emit("thing");
        }, 100);
      });
    }, 100);
  });
};

// PASSING
var Emm = new eventEmitter();
Emm.on("thing", function () {
  throw new Error("Emmited Error defined outside of scope");
});
tests[2] = function () {
  myDomain.run(function () {
    setTimeout(function () {
      Emm.emit("thing");
    }, 100);
  });
};

// PASSING
tests[3] = function () {
  myDomain.run(function () {
    var clientA = redis.createClient();
    clientA.hget("hash", "key", function (err, data) {
      throw new Error("An error after redis (A)");
    });
  });
};

// PASSING
tests[4] = function () {
  var clientB = redis.createClient();
  myDomain.run(function () {
    clientB.hget("hash", "key", function (err, data) {
      throw new Error("An error after redis (B)");
    });
  });
};

// FAILING
var clientC = redis.createClient();
tests[5] = function () {
  myDomain.run(function () {
    clientC.hget("hash", "key", function (err, data) {
      throw new Error("An error after redis (C)");
    });
  });
};

// start it up
runTest();
```

* Test #0 is a simple throw in the scope of the running domain. Everything works great.
* Test #1 is a very convoluted throw in the scope of the domain involving timeouts, process.nextTick, and event emitters (I was trying to break things). Node performs like a boss, and the eventual throw is caught.
* Test #2 defines the emitter outside of the domain, but invokes the event from within it. Still the error is caught.
* Test #3 creates the redis client within the scope of the domain. The error is caught and all is well.
* Test #4 creates the redis client OUTSIDE of the domain’s scope, but within a containing function. This error is caught by the domain.
* Test #5 fails, and the redis client here is created before the tests begin to run.

In all of these cases, I would have hoped for the domain to catch the exceptions. I would also have expected tests #4 and #5 to behave the same way, regardless of whether or not the client was created inside the test function. Can anyone help me explain why test #5 fails, and why only some clients created outside the domain can have their exceptions caught by it?

---

---
url: /blog/post/2016-12-23-on-ethical-ad-supported-businesses.md
description: >-
  Listen to this wonderful interview on KQED (NPR)’s "Forum" with Dr. Tim Wu
  On Advertising, Fake News and ‘Attention Harvesting’
---

Listen to this wonderful interview on KQED (NPR)’s "Forum" with Dr.
Tim Wu On Advertising, Fake News and ‘Attention Harvesting’ [**REBROADCAST: Tim Wu On Advertising, Fake News and 'Attention Harvesting'**](https://ww2.kqed.org/forum/2016/12/23/rebroadcast-tim-wu-on-advertising-fake-news-and-attention-harvesting/) Tim Wu is a professor at Colombia University who focuses on the science and history of advertising. He gives a good summary of: * How Media has often been ad-supported, going back to the "penny papers" of the early 1900’s, and cheap access to information is, on principal, a good thing. The information revolution probably would not have happened otherwise. * How the perfect storm of ad-supported media (both digital and traditional) optimize for clicks/views and can get into ethically questionable areas * How politics is an attention game, and Trump is a master at it, being both a billionaire and a celebrity. * And finally how political "Fake News" is a logical conclusion of the above. When discussing the Google and Facebook’s recent actions to combat Fake News, Dr. Wu does agree that when you advertise that you are selling "News", you should be held to a higher standard than other types of advertisements; that there is a moral imperative behind the word "News", which I agree with. However e*nforcing* this moral imperative is a rabbit hole of first-amendment ridden problems… I think there will always be a chicken + egg problem with "Fake News", no matter how vigilant Google and Facebook are. **The media and news ad-supported business models we have are, by nature opaque. They therefore create a situation where both the real and fake advertisers are playing the same game by the same rules.** Dr. Wu closes the interview with an offhand remark about how, when he’s feeling optimistic, he can see a better advertising future in which consumers are treated more fairly with regards to what they are consuming. To paraphrase: > *If I walk into a shoe store and see that they are charging $1,000 for a pair of shoes, I know that is not a transaction I want to be involved in. When I watch the Olympics, I know I’ll be watching \*some\* advertising, but is 6 minutes of commercials every 1/2 hour worth it? Maybe I would rather pay for this content…* This is a technology blog. I am a nerd who makes websites. I was fascinated by Dr Wu’s final comments about what he postulates future advertising might look like. It seems to center around the idea of choice, and removing the opaqueness from the process. I wanted to mock up what it would look like to give consumers an *ethical advertising* *choice*. ![](/images/medium-export/1__xK9bq4wdiLuVelPx__MeqrA.png) OR ![](/images/medium-export/1__JjPI1__tSIP__7rv5N0p__v9g.png) * Notice how there are no in-page ads at all when I’ve *elected* to pay for the content. * I am being presented with a real, live exchange rate between USD and "*Ad Time*", which in turn presents me with the information I need as a consumer to make an informed *purchasing* decision. Yes, consuming ad-supported content is still a purchasing choice, and we should make that clear. * Should I not have opted to pay for this content, I would **be generating $0.05/hour for the content creator**, rather than spending it\*\*.\*\* * The content producers do not need to change their business model at all. Should they elect to remain ad-supported, they can! Either way, they add a `` tag to their site, and off they go. The existing ad delivery networks handle the rest, including payouts to the content creators. 
There are a few examples I can think of where a company offers a hybrid "free + ads" or "paid and no ads" choice to the consumer. Spotify, Hulu, Pandora, etc… but they are all *entertainment* companies. I know that my choice to spend $10/month on Spotify Premium probably saves me from listening to a few hours of Ads each month… and I can make an informed consumer choice to pay it. I do not know of a news organization that gives me the same option. The mockup above is something that we have the technology to implement today, should we want to. It could be account-based (you \*must\* stay signed in to sites’ ad delivery partners at all times), or it could be a baked into the browser, which restricts access to certain content once a maximum "time balance" has been reached. What’s fascinating is that I don’t need to pay the advertisers in *money*. Say I’ve accumulated a balance of $5.00 by browsing around the web for a month. At the end of the month I have a choice: * Either pay $5.00 with my credit card, which the ad delivery network will then pay a portion back to the content sites I’ve visited * I can change my mind, and want to "pay" for the content with ads again. I can then watch 30 minutes of ads, all in one place. Once the advertisers of those ads pay the ad delivery network, then in turn, the sites I’ve visited will be paid. Option #2 is interesting because I’ve effetely separated my advertisement experience from the content. I’ve finally removed any bias I would feel from the ads with regard to the news, by deferring the "ad cost" until later. Is this a good idea? Or is this a vision of the end of the anonymous web, and a vision of a dystonic future? The *exchange rate* ($0.05/hr) was calculated as follows: * Assume the average person spends [50 minutes on Facebook each day](http://www.businessinsider.com/how-much-time-do-people-spend-on-facebook-per-day-2016-4) which I rounded to 1 hour/day * The average [CPC (cost per click) for a Facebook ad in Q3 2016 was $0.27](https://adespresso.com/academy/blog/facebook-ads-cost/) * Lets assume the average person clicks on one Facebook ad a week (I can’t find any good data on this assumption), so 1 ad click would accumulate every 7 hours of browsing Facebook. * I added a 20% safety factor($0.01) to account for inaccuracy in these assumptions Yes, it would cost me ~$1.50 to read the New York Times every day for 1 hour in a month… and I think I’m OK with that. --- --- url: /blog/post/2013-10-14-on-inclusion.md description: The topic of the day is inclusion. --- The topic of the day is inclusion. How can we be sure we are building a community around our brand/product/software which we are certain is free of bias and accessible to all? How can we be sure that our conferences, evangelists, and recruiters don’t alienate one group at the expense of the comfort (or even safety) of another? As the leader of a ( [fledgling](https://groups.google.com/forum/#!forum/actionhero-js) ) open source community, I dream of meet-ups and conferences. But does my super-hero inspired software project skew too male? Is the [cursing in my commit messages](http://www.commitlogsfromlastnight.com/) going to turn people away? Maybe. I am a forgetful person, who is often "neglectful" of empathy, and a person who will probably offend you without meaning too. 
I don’t have a global solution, and as a white male from the suburbs of the USA, I am aware that I am [playing the game on easy](http://whatever.scalzi.com/2012/05/15/straight-white-male-the-lowest-difficulty-setting-there-is/).

#### This is my official ask for you to let me know if I’m doing something wrong.

Please [let me know](http://blog.evantahler.com/pages/contact.html) if I am doing something to hurt or offend you. I am truly worried that I might be turning people away without realizing it.

---

---
url: /blog/post/2012-01-14-on-node-js-and-phidgets.md
description: >-
  For those of you who don’t know what Phidgets are, they are small robotic
  prototyping boards which can handle (binary) input and output…
---

![](/images/medium-export/1__HxL8cV__wrobhi5strUoFuQ.jpeg)

For those of you who don’t know what [Phidgets](http://phidgets.com) are, they are small robotic prototyping boards which can handle (binary) input and output, along with a collection of analog sensors for temperature, rfid, light, etc. I’ve been using them since [graduate school](http://etc.cmu.edu/) and they are really fun to work with.

A "Normal" Phidget requires a computer, as it is a USB device. There are drivers for most operating systems now, and the most common method of interaction is via C++ or Java extensions on the local machine that hosts the USB port. Recently, they have also included a web-service (sockets) which can be used to talk to the Phidget board. This method of connection has a much higher latency than direct C++ bindings, but I find that for most of the use cases I have, it works just fine (push a button, sense something at some frequency, etc).

So I thought to myself: Why not connect Phidgets to NodeJS?

#### Here is the result: [github](https://github.com/evantahler/nodePhidgets) & npm

This package will expose a data object to your application which can be used to periodically poll the state of the inputs, and events will fire which you can listen for with phidgets.on("data"). There are methods which handle connecting and changing the state of the outputs as well.

Enjoy!

---

---
url: /blog/post/2013-03-24-on-task-systems.md
description: Why Task Systems?
---

![](/images/medium-export/1__cvQ1iEN5JVT__bUpcd1tDvg.png)

### Why Task Systems?

I have been thinking a lot about task-queue systems lately. A queue system is integral to any modern application, as it can perform jobs in the background like sending email, processing bulk updates, etc. We all know that Cron is bad and has no place in this modern world of ours, and we need a distributed way of handling these actions :D

At [work](http://www.taskrabbit.com) we recently undertook the (successful!) migration from [DelayedJob](https://github.com/collectiveidea/delayed_job) to [Resque](https://github.com/defunkt/resque) and investigated many other options. We also have a custom add-on to Resque (to be released soon) which transforms Resque into a multi-app pub/sub message bus. I also recently had the opportunity to [re-write ActionHero’s task system](https://github.com/evantahler/actionHero/pull/89) to include support for more modes. I have been thinking about task systems a lot.

### Tasks vs Messages:

I want to make the distinction up front about what I consider a **Task System** vs. a **Message System**. To me, the main distinction is the notion of delay and deliverability. With a task system, you assume that there is a queue of actions to be eventually performed which is likely to, at times, fill faster than it can be worked.
With a task system you also make the assumption that each message/action/event/task **needs** to be inspected, and **must** be delivered to a broker/worker (the alternative being that the message is broadcast to whomever is available to hear it, including a collection of 0 listeners).

**tldr**: A message system delivers the message to anyone who is listening during the broadcast, while a task system enqueues the message until the recipient can read it.

### Queues

The most important thing a task system needs is a persistent atomic list to store tasks. This can be a database table or an array residing in the memory of some application (a [list](http://redis.io/commands#list) or set in redis). Usually, you want to back this up to disk.

Next, the list needs the property that items can be added to it uniquely and removed from it uniquely. That is to say that if a task is added, one-and-only-one copy of that task will ever be present, and that it is possible to remove one-and-only-one item from that list.

For Redis-based task systems (like Resque and Sidekiq), Redis is a single-threaded app which allows a worker to pop an element off a list. Being single-threaded, this means that 2 clients who request a pop at the exact same time will be ordered and handled in turn. If there is only one item to pop, only one client gets the data. Tasks can be added by a similar push method.

For database-backed queues, it’s still possible to ensure an "atomic" pop, but it’s a little trickier. Let’s assume we have a table with the following structure:

```text
id | created_at | data | run_at | locked_by | locked_at
--------------------------------------------------------
```

This structure is very similar to that created by DelayedJob. Adding a task to a relational database is as simple as a normal insert statement. Most relational databases have the property that if you do not supply a primary key, one will be uniquely assigned for you, and this ID can be thought of as the place within the task list.

Working the queue is a different matter. A worker simply cannot `select id from tasks where run_at <= NOW() AND locked_by IS NULL;` because this can easily result in a race condition: the acts of select and update (where a worker would claim a task) are not atomic. Putting them within a transaction is also no good, as you can’t read and make decisions on the result (is the result of the select null?). To solve this, there is actually a multi-step process. First the worker needs to select an eligible task per the SQL statement above. DelayedJob actually selects 5 tasks at once in case one of the jobs is claimed by another worker during this next step. The worker then writes a locked_by but not a locked_at. This indicates that the worker intends to work this job, but hasn’t started yet. The worker then reads the job again to ensure that their lock is still there (it’s possible for 2 workers to be on this step for the same job). If the lock is still there, then the worker can set a locked_at and start working the job. [Here’s how DJ does it for ActiveRecord (mysql, postgres, etc)](https://github.com/collectiveidea/delayed_job_active_record/blob/master/lib/delayed/backend/active_record.rb). This process is slower than a redis pop, but generally effective. The interesting thing about DelayedJob’s approach (when compared to the Redis version above) is that the task never ‘leaves’ the queue even when it’s being worked on.
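As a rough sketch of that claim dance (not DelayedJob’s actual code: `db.query` and `workTask` are hypothetical helpers here, and the write-then-re-read sequence described above is collapsed into a single conditional UPDATE):

```js
// Hypothetical worker step for a database-backed queue.
// db.query and workTask are stand-ins, not a real library API.
var workerName = "worker-" + process.pid;

db.query(
  "SELECT id FROM tasks WHERE run_at <= NOW() AND locked_by IS NULL LIMIT 5",
  function (err, rows) {
    if (err || rows.length === 0) { return; } // nothing to do right now
    var taskId = rows[0].id;
    // The WHERE clause makes the claim conditional: if another worker already
    // wrote locked_by, this UPDATE matches 0 rows and we simply lost the race.
    db.query(
      "UPDATE tasks SET locked_by = ?, locked_at = NOW() " +
        "WHERE id = ? AND locked_by IS NULL",
      [workerName, taskId],
      function (err, result) {
        if (!err && result.affectedRows === 1) {
          // We own the row; the task stays in the table while we work it.
          workTask(taskId);
        }
      },
    );
  },
);
```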
Because the task stays in the table while it is being worked, if a worker crashes the data is not lost and can be examined. It is certainly possible for a redis worker to write its task’s information back, but there will be a short while between the pop and that subsequent write. There are backends for DelayedJob which use noSQL databases (mongo), and they work almost exactly the same way, since these databases don’t have a unique "pop" function.

### Priority and Multiple Queues

A pattern Resque made famous was the notion of queues and priority. By default, Resque creates "high", "medium", "default", and "low" queues. This allows you to add normal tasks to the "default" queue, but if you have something important you want worked ASAP, you can add it to the "high" queue and it will be worked by the next available worker. When you launch a worker, you can also tell it which queues to work, in order. You can also say "*", and work all queues in alphabetical order.

### Delayed Tasks

Sometimes tasks need to be delayed. If I want to "check a service in 10 minutes" or "send a welcome email 15 minutes after a user signs up", you need not only the above notion of priority, but also of chronology.

Database-backed task systems have this easy. Add a column with a timestamp (run_at above), and just check that against the current time (… AND run_at <= NOW() …) in the pop’s where clause. Easy!

Redis-based systems have it harder. You don’t want to be popping each and every task, de-serializing its data, checking the timestamp, and then (most of the time) returning it to the queue. This is both dangerous and time consuming. [resque-scheduler](https://github.com/bvandenbos/resque-scheduler) solves this problem by keeping n timestamped queues for tasks that haven’t run yet (and a list of those queues in a hash). This allows the scheduler to move a task over to the "working" queue whenever its time is ready.

### Recurrent Tasks

There is a certain subset of tasks which repeat. A task like "make a database backup" or "email the marketing team stats" repeats with a fixed frequency. Neither DelayedJob nor Resque has support for this type of task natively, but conceptually, if you have delayed tasks, you can always re-enqueue yourself after you finish. Resque Scheduler added support for a "schedule" in later releases that lets you define a [yml list of tasks to run](https://github.com/bvandenbos/resque-scheduler#scheduled-jobs-recurring-jobs), and when to run them.

### Duplicatable tasks

This last type of task takes us back to cron. This is the type of task I really do want to run "every day at midnight" on all my servers. This would be a task like "delete the /tmp directory’s contents" or "restart apache". While seemingly simple, this is the hardest type of task to implement in a task system, as it tends to go against the general point of an "atomic, first-come, first-served" queue. As far as I know, there isn’t a mechanism for this in either DJ or Resque, but I did recently add one to ActionHero.

### My ideal task system (AKA How [ActionHero](http://actionherojs.com) does it)

When building ActionHero’s task system, I took a look at all the types of tasks above, and tried to sort out how to build a system that could accommodate all these options. Luckily, I had some great existing systems I could borrow (steal) concepts from.
#### Anti-patterns I wanted to correct

* **The need to run separate workers from web servers**
  * In an event-based world (node, twisted-python, event-machine), the same argument for handling many event-driven clients per process can be made about task workers. "You spend most of your time waiting for IO, so why not do something else while you wait?" This is the main selling point of Sidekiq when compared to Resque. Taking the metaphor further, why not have the same process able to handle both web requests **and** tasks? ActionHero does this.
  * It is likely that tasks will be more CPU intensive than web requests, so ActionHero limits how many ‘worker timers’ can be running at once, rather than ‘as many as I can handle’, which is the case for web requests. You can also run the server with its web servers disabled to use it just as a worker.
  * This also makes deployments simple, since running the server and the worker is the same thing!
* **The need to run any process "uniquely"**
  * [resque-scheduler](https://github.com/bvandenbos/resque-scheduler) is a great addition to the Resque ecosystem, but you must only run 1-and-only-1 instance of it at a time, which is odd and introduces another single point of failure.
  * The notion that only one worker can move a task from "delayed" to "ready" seems silly to me if the tasks are still kept in a pop-able queue while they are delayed.
* **All task data should not be stored in the queue, but somewhere more permanent**
  * References to a task need to be kept uniquely in a queue, but the task data itself doesn’t have to be. Why not keep it in a hash with a ‘state’ field? Doing this allows for a few benefits:
    * if a task is lost / crashed there will still be some record of what it was
    * you can follow the status of a task as it evolves
    * less information to be constantly popping and moving around
    * separation of ‘metadata’ from ‘action’

### ActionHero’s data model

actionHero defines tasks in /tasks:

```js
exports.sayHello = {
  name: "sayHello",
  description: "I say hello",
  scope: "any",
  frequency: null,
  run: function (api, params, next) {
    api.log("Hello, " + params.name);
    next(null, true);
  },
};
```

and you can invoke the task in your code like so:

```js
var task = new api.task({
  name: "sayHello",
  runAt: new Date().getTime() + 30000, // run 30 seconds from now
  params: { name: "Evan" }, // any optional params to pass to the task
});
// and then you can either task.enqueue() or task.run()
```

First, note the metadata. scope is the toggle for defining whether the task is duplicatable or not. "any" means that any worker can work this task atomically. "all" means that the task will be worked by all servers currently running, and duplicated accordingly (I’ll explain how later).

Next, you can choose to either run or enqueue the task. The difference between run and enqueue+now is that run() invokes the task in-process rather than passing it to the first available worker. This is useful if you really need to invoke a task **now** for a request.

runAt is an optional parameter to define when the task should be run (which defaults to now). This isn’t a guarantee that the task will be run exactly at that time (as all the workers might be busy), but rather that the task is "ready" to be run at that time. This solves the "delayed tasks" problem.

Finally, there is the frequency option. This is used to denote whether a task is recurrent, meaning the task should re-enqueue itself when it completes.

ActionHero uses redis to store its task data.
To enable "all" task scopes, this means that every server (not worker) needs a ‘local’ queue of what to work on in addition to the global list. To enable the runAt parameter, there needs to be a ‘holding pen’ for tasks which aren’t ready ro run yet. This leafs us to: ```js api.tasks.queues = { globalQueue: "actionHero:tasks:global", delayedQueuePrefix: "actionHero:tasks:delayed", localQueue: "actionHero:tasks:" + api.id.replace(/:/g, "-"), data: "actionHero:tasks:data", // actually a hash workerStatus: "actionHero:tasks:workerStatus", // actually a hash enqueuedPeriodicTasks: "actionHero:tasks:enqueuedPeriodicTasks", }; ``` and the worker’s run the following pattern: ```js api.taskProcessor.prototype.process = function (callback) { var self = this; clearTimeout(self.timer); self.processLocalQueue(function () { self.processGlobalQueue(function () { self.processDelayedQueue(function () { self.setWorkerStatus("idle", function () { if (self.running == true) { self.timer = setTimeout(function () { self.process(); }, self.cycleTimeMS); } if (typeof callback == "function") { callback(); } }); }); }); }); }; ``` Interestingly, the only step that actually "preforms" the task is self.processLocalQueue. The rest of the methods move the task’s pointer from queue to another. Remember that the actual job’s data is stored in the data hash. This allows us to do things like check enqueuedPeriodicTasks at boot and ensure that all the periodic tasks I know about (where frequency > 0) are in system somewhere by inspecting the data hash. Even currently processing jobs will be present. To run a duplicatable tasks, the worker which moves the job from the globalQueue to it’s own localQueue will also place a duplicate copy in every other server’s queue. For delayed tasks, the delayedQueuePrefix is used to create a number of queues of the form delayedQueuePrefix + timestamp. A queue may hold a number of jobs which can be run at that time. Rather than checking if every individual tasks is ready to be run, we can simply use the names of the delayed queues and compare them to the current time. As the delayed queues are still atomic lists, it safe for many workers to pop from them at once, and re-insert the job’s pointed into the globalQueue to be worked like normal. #### TODO * Right now, there is no concept of ‘priority’ ("high", "medium", "low") in ActionHero’s task system. That’s next on the list! ### Fin. I did not have a goal in sharing my thoughts on task systems with you, but hopefully the comparison of how various systems work will be helpful as you choose what to use for your next project. *Originally published at 24 Mar 2013* --- --- url: /blog/post/2019-10-15-online-offline-sync.md description: Welcome to the third installment of The Illustrated Actionhero Community Q&A! --- Welcome to the third installment of The Illustrated [Actionhero](https://www.actionherojs.com/) Community Q\&A! Every week in October I’ll be publishing a conversation from the [Actionhero Slack community](http://slack.actionherojs.com/) that highlights both a feature of the Actionhero Node.JS framework and the robustness of the community’s responses… and adding some diagrams to help explain the concept. ### Online and Offline Sync October 14th, 2019 [Source conversation in Slack](https://actionherojs.slack.com/archives/C04EVSUSD/p1568327323188400) Community member NRK asked: > Hey everyone, I could use some high-level pointers. 
I’ve been looking at AH for a game where I need to keep track of user’s amount of money, which changes based on different stuff even when they’re offline. I’ve been looking at AH tasks for this and then just pushing the new state to connected users periodically. I’d like to do these cycles (both updating and pushing) as often as possible, ideally less than a second apart. Am I on the right track? Core-contributor Chad Responds: > *@nrk It’s a little hard to say for sure with only this 100,000-foot view but many of us here in this channel have dealt with online/offline sync use cases and can help out if you run into trouble. ActionHero is already sounding like a good base for you because with it you’ll get two things most other platforms don’t provide:* > 1. Redis-backed cross-cluster RPC and pub/sub. Most other frameworks require you to build this yourself, but AH includes it by default. It provides simple APIs for sending messages "to find the user if they’re connected anywhere on the cluster" and doing other things like that. > 2. Multi-protocol server support. So you get REST/Web like anything else. But you also get WebSockets out of the box — nothing to add. Just turn it on and the same actions run on either one. But you ALSO get a very easy way to add custom protocol servers. A raw TCP socket is also included but others have done protobuf and other protocols recently and had good success with that. > *So just napkin-drawing here, you might envision an architecture where your clients connect via raw TCP or WebSocket to a node in your cluster, and execute actions there. As their actions generate consequences, you can broadcast messages to the cluster to "find" those users and update them live if they’re connected. You can also use the tasks layer to do background/async work whether users are online or offline (batch, scheduled, triggered by other users, etc.) and not have to know if the users is there. If they are, they get a live update. If they aren’t, you arrange for them to start with an initial status message when they next connect.* > *AH includes some other meta concepts that make certain features easy to add, such as chat room mechanics. That can be used for what it’s named for (making chat rooms) but also for other things that work the same way even if they’re named differently (like monitoring a currency or stock symbol — each stock could be a "room" and the system could broadcast price updates to that "room"to any users monitoring it).* > *Thanks for taking the time to type this up and help out. Sounds perfect! I’ll definitely give it a go then. Love the docs and tutorial with examples so far!* The architecture described by Chad could be drawn like this: ![](/images/medium-export/1__G93TtL608iyLjBzns6aWyA.png) And, as a reminder, "Chat Rooms" can do far more than just send text messages back-and-forth! ![](/images/medium-export/1__6t6nYRSZIwZ__EjbUDLp7zw.png) > The diagram in [Production Notes](https://docs.actionherojs.com/tutorial-production-notes.html) is under-appreciated, in my opinion. It’s really what separates ActionHero as a framework from other more simpler ones like Express/Sails/etc. Express is just a framework. It does what it does and you can write software it it. The smarts are good, but they only do what they do at face value. ![](/images/medium-export/1__IxKe__qGKhWivj2W2CvNYYA.png) > ActionHero is a framework that enables an architecture (shown above). 
Any but the simplest apps eventually need to scale, and when they do, you have to figure out how multi-server work is going to happen. The moment you do, you see how weak things like Express are, because you have to write so much code yourself just to get a message from user A on node A over to node B where their recipient is. When you go read "tutorials" like [this article](https://www.manifold.co/blog/building-a-chat-room-in-30-minutes-using-redis-socket-io-and-express-9e8e5a578675), you realize how much code they just wrote to do what AH did out of the box.

To wrap it up, Chad writes:

> It’s common in many AH setups for admins to want to separate "front-end" from "batch" services. For instance you might have an action `create:account` that sends an email to the user. Sending the email might take a bit of time and involve some retrying with your mailgun / sparkpost setup. So `createAccount()` can make the user in the DB and return the session, then call `enqueueTask('welcomeEmail', {params})` to send the welcome email. And you could configure a "worker" server to do that.

> This is all optional. You could just run everything on a single box, or maybe just two main nodes for DR: Actions, Tasks, the whole shebang. The point is that if you WANT to have "front-end nodes vs. batch processing nodes" you can. Some orgs really want this, because their front-end nodes get DDoS’d a lot (or they get a lot of traffic). Back-end (worker) nodes aren’t typically visible to the Internet, so they aren’t affected by attacks. Workload requirements may also vary. For instance, batch processing might need a ton of memory and disk (uploading videos from a video chat site to Youtube for archival) while front-end nodes may be more CPU intensive, handling tons of user calls in parallel. So maybe you want your task workers to be beefy memory-heavy systems with big disks, and your front-end nodes to have tiny disks and lots more CPU.

> This is what I alluded to above: Actionhero is a framework that enables an architecture. You get the code, but you also get the ability to operationalize your app and scale it as you grow.

Chad has built a number of [high-throughput applications](https://www.medialantern.com/) with Actionhero. He’s done a great job of explaining the high-level architecture of how you might use & scale Actionhero to manage complex backend state, and how to architect your applications.

---

---
url: /blog/post/2013-01-07-owncloud-and-dreamhost.md
description: Intro
---

![](/images/medium-export/1__DL3CdUDzttWP5FxDvsO9hg.png)

### Intro

I have a lot of photos and MP3s. I want to keep my collections in sync on all my computers and devices. I **don’t** want to use a specialized service for each type of media (Flickr for photos, iTunes Cloud for music, etc), and more importantly, often these services require that your media become published. I have no problem with publishing SOME of my media, often to social networks, but I don’t want *all* of it to be published. Tools like Google Drive or Dropbox are what I want, but I don’t want to pay that much for them. [I once tried to write a little daemon that kept things in sync over SFTP](https://github.com/evantahler/synchzor), but I didn’t get too far with it.

[Today I tried out the open-source version of ownCloud](http://owncloud.org/), and I was blown away. It was exactly what I wanted.
### OwnCloud

#### Features that mattered to me:

* Dropbox-like daemons you can run on Mac and PC to keep things in sync
* The ability to choose multiple folders to keep in sync, and which remote folder to sync with. Almost more importantly, this means that you can keep folders on the server, upload to them, and then remove your local copies. This is a great option for archiving things away.
* Access to all of your files via the web. You can do this through the OwnCloud http site, and I was happy to learn that you can just S/FTP into your server and look at the repo directly. You can optionally turn on server-side encryption, but I chose to leave it off for this purpose. I trust my SSH keys :D
* Public sharing links. Right click on a file, get a crazy-hash URL for folks to look at it.
* iOS app ($0.99). Again, this app acts just like Dropbox, in that you can browse your repo, download files you might want to your phone, and upload images and videos.
* Multiple users. While the user support isn’t perfect yet (groups are a little buggy), the notion that I can have family folders and personal folders is pretty awesome. I can have shared folders with some people (perhaps a shared music collection) and still have private folders for myself.

#### Features that are cool, but that I don’t need (yet)

* web MP3 player
* calDav server
* cardDav server

The calDAV/cardDAV servers are interesting because you can use them as endpoints to not only host your own calendar and contact list, but also keep them in sync with all of your devices. Currently, this is a service I am happy to leave in Google’s hands, but it’s nice to know that it is there if I need it. I have the feeling that many of OwnCloud’s developers care strongly about owning their own data (rather than Google or Facebook having it) almost as much as the ability to store tons of files in the sky :D

### Setup

OwnCloud only needs PHP 5 and a database to run. Simple! Database options are either MySQL or SQLite right now, but again, kudos to the team for picking the most widely available tools.

With such simple requirements, I chose to host my instance on a shared [DreamHost](http://dreamhost.com/) account. This costs $10/mo for all the bandwidth and storage you might need (they say ‘unlimited’). It was easy enough to set up and install: just download the latest TAR, unzip into your site’s public folder, and the website walks you through the rest.

The only "coding" I needed to do was to increase my PHP upload file size (which DreamHost defaults to 7MB). These days, doing this is quite simple: just make a `~/.php/5.3/phprc` and add any settings you want to override the default php.ini with. I literally only needed to add `upload_max_filesize = 100M` and `post_max_size = 100M`. Then restart your PHP workers, and they will re-spawn and take on the new settings.

Optionally, you can have DreamHost ‘https’ your site for another $3/mo. Given that I will be transmitting about 1/2 the contents of my hard drive to the server, this seemed like a good idea. You should do it too.

### Closing Thoughts

As I write this, I am in the process of uploading about 5GB of photos to my new ownCloud server. I am simultaneously looking at the uploads on my phone and having another computer stay in sync and download them as soon as they are available. This works exactly like I wanted.

In comparison to Dropbox, paying $10/mo is about the same price as their 100GB plan. Right now, I have less than 100GB I want to sync, but I don’t think that will always be the case.
I was also already paying DreamHost $10/mo for hosting some "real" websites (you can host any number of websites on your plan), so that was a sunk cost for me; only the $3/mo for https was new for this project. I also really like the feature that I can leave files on the server and not need them on any device, as an archive. This is something I don’t think you can do with Dropbox.

Sure, the UI needs some work, and the desktop app crashed a few times on me, but for a free product, ownCloud is wonderful. I can’t wait to see what they come up with next.

---

---
url: /blog/post/2025-01-14-permissions-for-ai-use-cases.md
description: AI can see too much if you aren't careful
---

![title card](/images/posts/2025-01-14-permissions-for-ai-use-cases/image-1.png)

![Our visualization for bringing permissions to your AI application.](/images/posts/2025-01-14-permissions-for-ai-use-cases/image-2.png)

*Our visualization for bringing permissions to your AI application.*

# Overview

When building AI pipelines, syncing who has access to the content is almost as important as syncing the content itself.

In traditional data warehouse work, access to the tables in the warehouse is controlled by humans on the data/analytics team. If the viewer’s role is appropriate, they can view the table - e.g. the finance team can view the "purchases" table, but not the full “users” table, which contains PII.

However, it will be common for AI applications to work against datasets of content, not just data - including text, videos, etc. This content is guaranteed to contain PII, sensitive business or financial information, and other “private” information. To that end, we can’t use the same human-controlled permission model as before, granting all users the same access to the “Google Drive Documents” table - their individual roles and access need to be preserved. Furthermore, when building secure AI applications, it is imperative that the context provided to the LLM only includes content that both the machine and the end-user are allowed to see - relying on the AI itself to guard sensitive information has been regularly shown to be a [flawed approach](https://genai.owasp.org/llmrisk/llm062025-excessive-agency/).

To this end, when thinking about AI applications, a multi-stage permission model will be needed:

* Extracting permission, role, and identity along with the content
* Filtering content into role-specific collections
* Mapping Identities to streams
* Making all requests to the context collection with the appropriate identity

Not every source provides the ability to extract identities, roles, and permissions at the level we would like. This is why a two-phase approach is needed - coarse filtering plus identity mapping - to allow some level of access control even for the worst sources.

## A note on pre vs post access lookup in the system of record

The most secure way to check access would be to query the [system of record](https://research.google/pubs/zanzibar-googles-consistent-global-authorization-system/) (e.g. Google Docs APIs) at read time to confirm that the user still maintains access to the original document. This is a great approach when the user is requesting access to an individual resource (and the system of record is both performant and has high uptime). However, for AI use cases, which operate on either multiple resources (RAG) or aggregates (text-to-SQL), the cost of looking up every possible resource will be too high, and the calls will likely be rate-limited. Therefore, we need to cache this information ahead of time to make it available to us at query time.
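As a minimal sketch of the difference (the `googleDrive.checkAccess` call and the `aclCache` object are hypothetical stand-ins for illustration, not real APIs):

```js
// "Post" lookup: ask the system of record about every candidate document.
// Accurate, but one upstream API call per document - too slow and too
// rate-limited to run inside a RAG query over hundreds of candidates.
async function postLookupFilter(userEmail, candidateDocs) {
  const allowed = [];
  for (const doc of candidateDocs) {
    if (await googleDrive.checkAccess(doc.remoteId, userEmail)) {
      allowed.push(doc);
    }
  }
  return allowed;
}

// "Pre" lookup: filter against permission metadata cached at sync time.
// Fast enough to run on every request, at the cost of some permission lag.
function preLookupFilter(userEmail, candidateDocs, aclCache) {
  return candidateDocs.filter((doc) =>
    aclCache.allowedEmails(doc.remoteId).includes(userEmail),
  );
}
```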
Caching this way introduces a few unavoidable drawbacks, notably permission lag (you might be able to see an item you shouldn’t see any more) and flattening (the nuance of the groups and ACLs in the source system will be lost and made more coarse).

To mitigate the lag issue, we will be allowing users to sync the permission data/stream more often than the content itself. To mitigate the flattening issue, the context identity streams should not be used as a system of record, but rather as metadata on the data stream.

# Extracting permission, role, and identity along with the content

Consider a Google Drive source. We will want to ingest all the presentations and documents available for our application, and maintain which of those documents each user has access to. Google Drive does provide APIs to load all of this information, so we can produce 2 streams of data:

* **Files**: The documents themselves in the textual (e.g. markdown) format we need for our AI applications, plus related metadata. Included in this metadata are the IDs of users and groups who have access to each document, as well as the file/bucket the content came from, created and modified timestamps, etc.
* **Identities**: The “unrolled” list of users who are members of each and every possible group in the domain.

The best sources will provide both of these streams. Each of these streams can be set to sync at different frequencies, and will likely be set to use different sync modes as well - Files will likely be incremental, but Identities will likely be full-refresh.

All File streams will gain the following properties:

* `allowed_identity_remote_ids` (`list[str]`)
* `denied_identity_remote_ids` (`list[str]`)
* `publicly_accessible` (`bool`)

Unrolling group memberships is the job of the source, and the schema of the Identities stream will at least include:

* `remote_id` (`str`)
* `email_address` (`str`)
* `member_email_addresses` (`list[str]`)

# Filtering content into role-specific collections

As an administrator of the ingestion pipeline, you may wish to use filters as a coarse way of adding role information to the dataset. For example, if your incoming dataset from Google Drive includes the original file paths of the documents, you may want to exclude any documents in the “exec” folder, as they are likely to be too sensitive. Or, you may want to split your dataset into “EU” and “USA” documents based on other pieces of metadata (e.g. folder name or group memberships), to provide limited or different information to different groups of users.

This filtering step is especially useful when the source is not able to provide complete Identity and Role information - filtering can act as a stop-gap.

# Mapping Identities to streams

If the source is able to provide Identity and Role information, we need to join the streams together. This is done by way of a “mapping”, joining (in the SQL sense) all the user and role information to the original content record so that we have an easy way to query who can access each item. This will be duplicative of the data (adding storage cost) with the goal of producing a faster runtime lookup (much like an index).

We have decided to use Email Addresses as the shared unit of identity across all sources.

# Making all requests to the context collection with the appropriate identity

It is finally time to use our data in an AI application. As the application/agent developer, you will choose to hit the RAG / Chat completion / Search / Aggregation APIs with a user’s email address or without one.
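Sketching what that per-request check could look like (the field names follow the File stream properties above; `identityIndex` and the deny-wins ordering are assumptions for illustration, not a fixed API):

```js
// Return true if this file record should be visible to the requesting email.
// allowed/denied lists may contain user or group remote IDs, so group
// membership is resolved through the (hypothetical) identityIndex.
function canSee(file, email, identityIndex) {
  const matches = (remoteIds) =>
    remoteIds.some((id) => identityIndex.membersOf(id).includes(email));

  if (matches(file.denied_identity_remote_ids)) return false; // explicit deny wins
  if (file.publicly_accessible) return true;
  return matches(file.allowed_identity_remote_ids);
}

// At query time, only records that pass canSee() are eligible to become
// context for the LLM; with no email supplied, the whole (already filtered)
// collection is eligible.
```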
If an email address is provided, all the information provided back to you & the LLM will be further filtered to what they are allowed to access and the filtering provided by the context collection itself. Otherwise, they will have access to all the content in the (filtered) collection: ![Our visualization for bringing permissions to your AI application.](/images/posts/2025-01-14-permissions-for-ai-use-cases/image-3.png) --- --- url: /blog/post/2012-04-11-phonegap-and-push-notifications.md description: Mobile Apps with JS --- ![](/images/medium-export/1__yizkpDcoH__Bpah8nUy__0PA.jpeg) ### Intro #### Disclaimer: this article only applies to iOs… sorry Android and Blackberry folks. I have been developing a game which will be played on mobile phones. I chose to create the game using [PhoneGap](http://phonegap.com) ( or [Cordova](http://incubator.apache.org/callback/), its open-sourced cousin) rather than making a native application. If you don’t know, the quick pitch for PhoneGap is that you can make an html-5 application AND have access to the "hardware" (filesystem, contacts, camera, ect) via javaScript. There were are few reasons I made this choice: * I wanted to run the application on as many devices as possible, and rather than writing both Android and iOs code, I can write the app once * I can also have a mobile version of my app (with limited features) for free * I really didn’t have time to become an iOs expert for this project * My game doesn’t need a 3D engine nor is it "fast twitch" Now there are known limitations with PhoneGap (as there are with any framework), but the biggest ones for me included "being slower" than native apps and no built-in push support. I had started developing my game in the browser anyway (as it is where I can prototype the fastest), and I was able to run the game OK in the iPhone’s safari browser, so I made the leap that it would probably run OK in PhoneGap as well. My only hurdle to conquer was push notifications. ### Starting at the Server I’m a "back end" guy. You might have guessed that from my work on [actionHero](http://actionherojs.com/) (which I am OF COURSE using to power my game), but I always prefer to develop my game in an [API-FIRST method](http://api-first.com/) so that I know the system works, and then I can asynchronously build/iterate my front end to handle those actions. That meant deciding how I was going to send my Push notifications. I took a long hard look at [Urban Airship](http://urbanairship.com/), not only because they throw a great SXSW party, but because a lot of [companies I respect](http://www.taskrabbit.com/) tust them with their push messages. However, I was feeling cheap and wanted to see how hard building my own would be :D The answer (in node.js) is really simple! [Argon has built a wonderful package](https://github.com/argon/node-apn) you can use to send push messages. Because I already had a [task-queue system](http://blog.evantahler.com/actions-vs-tasks) ready to go in actionHero, it was a simple matter to build a task which would look in the database for pending message and shoot them out using the APN package. Any other action or process could simply drop in the message into the queue along with the userID and some metadata, and the message would be sent off in a short while. Conceptually, this seemed like all I would need. Lets pause for a minute and talk about what needs to happen for your iPhone to receive a push message. First, you need to install the app. 
When you install an app which has registered itself to receive push messages, you then need to allow it (via pop up on the first run or from the notification preferences) to receive these messages. Once your app is configured, you need to get your device’s unique ID to your server so you can send messages to the device. The unique ID is NOT the device id from [device.uuid](http://docs.phonegap.com/en/1.5.0/phonegap_device_device.md.html#device.uuid) in PhoneGap, but rather a serial number which is hardware-bound to your device. This took me a while to sort out. When your server wants to send a message, it needs to first authenticate using certificates generated via the Apple provisioning portal, and then send the message to the device ID. I actually had a hard time with my security keys ([which you mange here](https://developer.apple.com/ios/manage/bundles/index.action)). Every app you develop needs to have a unique key which will be used for distribution in the app store and for messaging, iCloud, etc. When you create a new appID you create a new certificate. You generate a random private key and a CSR (certificate signing request) on your local machine, upload it, and then Apple gives you back a public certificate. Your application name MUST match the certificate’s app name (and what you typed into the app ID generation page). This is how apple knows which app to send a push notification "about" because logging into the push server also requires that same key and cert file. Hopefully this explanation will save you a few headaches. [Here’s a great guide on the push notification ecosystem I found useful](http://www.raywenderlich.com/3443/apple-push-notification-services-tutorial-part-12). Now, the default formats which these certificates are in is hard for node.js to read, but you can convert them. Follow these steps (taken from the APN package) ```bash $ openssl x509 -in cert.cer -inform DER -outform PEM -out cert.pem $ openssl pkcs12 -in key.p12 -out key.pem -nodes ``` Now you can pass these new files in as options.key and options.cert. ### Getting the Device Key: a PhoneGap Extension Ok! We now have our keys and know how to connect to the Apple push server… now how do we get the device ID? It turns out that in PhoneGap 1.5, there is not a way to get it. BUT, the magic of phone gap is that it’s primary job is to pass data and events from the "iOs" layer of your app to the "Javascript" layer of your app and back again. That means that if you can get this data in iOs, you can pass it up the stack! [I have to give Dave Hiren credit for these next steps, as he has a really good write up on his blog.](http://davehiren.blogspot.com/2012/03/get-device-token-for-apple-push.html) Here’s the deal: We are going to create an iOs method to get our token. We need to make **/PushToken/PushToken.h** and **/PushToken/PushToken.m** We also need to update our main **AppDelagte.h** to be aware of our new ‘token’ variable These 2 methods handle the succes and failure cases of registering for to catch push messages. As a note, the iPhone simulator cannot receive push messages, and will always run the failure case. You will need to test this out on an actual phone or iPad. Finally a new method to pass our object up to the JS layer (aslo in **AppDelate.m**). This should be defined towards the end of the file) ### Javascript The neat thing about push notifications is that the last line we added is all you ned to register the app to receive them. 
This will automagically inform the OS to add the app to the notifications list, accept messages when the app is not running,and obey the settings in the notifications settings. You can see we enabled all types of messages (badge, sound, and alert). What all the code above did was allow JS to access the "getToken" iOS method. Cordova/PhoneGap has some methods to access these types of things, so here is my JS wrapper to map the device token: You also need to tell the **cordova.plst** to enable the plugin by adding ```text PushToken PushToken ``` Now our variable token is now available to us as device.token in javascript! We just made a PhoneGap extension. I chose to grab and upload the device token upon login and keep a table of which userID is associated with each token. I have both an iPad and an iPhone, and I want to get notifications on both. This means that when I send a message to an account, I need to expect that each account may have more than one device. It is also best to expect that a device may change hands, and so I also need to check re-assign existing device tokens to accounts on log in as well. ### What about in-app messages? When testing my app, getting messages when the app was closed or not running worked as I expected by firing off the default sound effect and showing the message on the top of the screen. However, when I was in the app, I never got any alerts! That is because the application itself it supposed to handle incoming messages when active… something that PhoneGap doesn’t do out of the box. Using what I learned above, I realsed that I could also create another method to keep and store the most recent message’s information. The steps are similar to before in that you need to: new file: **/LastPushMessage/LastPushMessage.h** New File: **/LastPushMessage/LastPushMessage.m** Add our new variable to **AppDelegte.h** @property (retain, nonatomic) NSString\* LastPushMessageMessage; Instantiate that same variable within **AppDelegate.m** You will note I really only care about the "human readable" string, here but this can be extended. Just like before, I now have access to the iOs "getLastPushMessage" method. Now, you will not see events thrown by incoming messages, so I made a method to poll for new messages every minute which will alert the user: You also need to tell the cordova.plst to enable the plugin by adding ```text LastPushMessage LastPushMessage ``` to the plugins section, Note that you need to escape the content of the message, as it will be HTML encoded. You also don’t have to poll continuously for new messages, and can call app.getLastPushMessage whenever you want, but you might miss some. Also remember, your app will only ever see messages related to itself. Other app’s messages will be handled in the normal "pop up" way when they aren’t in focus. **The code is [here](https://gist.github.com/evantahler/bf5b2430544d37ad5aa0a0d3fdc12974)** Done! --- --- url: /blog/post/2012-02-15-pivotal-tracker-and-nerf-guns.md description: Nerf Guns and Product Management --- ![](/images/medium-export/1__j7dGWlXBg__zVQtw0tHlE2A.jpeg) #### Update: It seems that the good folks on Hacker News also like the internet and nerf guns (who knew!?). [There is a growing collection of other folk’s hacks regarding testing, project management, and foam weaponry](http://news.ycombinator.com/item?id=3594587) \*\* *** [In previous posts](http://blog.evantahler.com/on-nodejs-and-phidgets), I have talked about how to use [node.js](http://nodejs.org/) to talk to a phidget board. 
I have even talked about [how to run node.js ON a phidget board](http://blog.evantahler.com/node-js-running-on-a-phidgets-sbc2-board). Now you may be wondering what kind of projects you might do with a small embedded computer that has a solid http I/O stack. Here is a "suggestion". At [ModCloth](http://modcloth.com) we used [pivotal tracker](http://www.pivotaltracker.com), an awesome agile project management tool from the folks over at [Pivotal Labs](http://pivotallabs.com/). I have fallen in love with the tool for a number of reasons, but the most important reason for this article is how they really "get" the modern web-development workflow, and the tool fits nicely. They have tons of third-party integrations (Jira, GitHub, ect) and a great API. As a Product Manager with a remote team, I was always thinking about ways for our team to keep in touch better. While I was working on the phidget library, I thought I might use it and the tracker API to ring a small gong (or something similar) whenever a story was accepted as a proxy for me being able to quickly yell "thanks" (a story is a "small unit of work" for the uninitiated, like an item on your to-do list). I never found a gong, but I did find a [motorized nerf gun](http://www.amazon.com/Nerf-N-Strike-Vulcan-EBF-25-Blaster/dp/B0013U95U2)… and you can imagine where this is going :D Unfortunately, I didn’t get this project done in time before I left ModCloth, but I’m sure that I’ll find a use for it in the future! The first step was to hack the gun so that I could control it "digitally" without needing to squeeze the trigger. As it turns out, children’s toys are REALLY well made these days, and it took about an hour to pry everything apart. I forgot to take detailed photos, but what I ended up doing was soldering 2 new contact wires to the bridge on the on/off switch, and the motor itself. This was done in series with the trigger’s gate so it still worked and would complete the same circuit. I drilled a small hole in the base of the handle for the new wires to come out ![](/images/medium-export/0__yE4FPa__KrztRYOYt.jpeg) The gun used the trigger to transfer the current rather than be a relay for the motor, and this meant that my new cables would be carrying a significant amount of amperage from its 6 "D" Batteries. I couldn’t just complete the circuit with the phidget board (it would fry the computer, or at least be grounded out). Luckily, I had [a 10-amp relay board](http://www.phidgets.com/products.php?product_id=3051) which worked with the phidget board. Now with everything wired up, I needed a way to talk to the pivotal tracker API. [Wizcorp](https://github.com/Wizcorp/node-pivotal) had made a basic polling node.js wrapper the Tracker V3 API which did what I needed. Thanks WizCorp! The application logic is simple: * Connect to the Phidget board * Authenticate with Pivotal Tacker * Poll your project every so often and look for story state-changes * "fire" if a story you own was rejected. You can also use a similar mechanism for story acceptance. The next steps in this project are obviously to pour your engineers a drink automatically when a story is accepted :D Here is the application (written for node v0.6.x). This code is pretty terrible (as I limited myself to 4 hours for this project, including the construction), and lacks any failure / error handling, but hopefully it is easy enough to understand. 
```js var phidgets = require("phidgets").phidgets; var pivotal = require("pivotal"); // curl -d username=$USERNAME -d password=$PASSWORD -X POST https://www.pivotaltracker.com/services/v3/tokens/active var apiToken = "XXXXXXXXXXXXXXXXXXX"; var projectID = 12345; var OwnedBy = "Evan Tahler"; var checkTimerMS = 500; var newer_than_version = 0; var PhidgetHost = "phidgetsbc.local"; var phidgetsReady = false; var phidgetData = {}; var init = function (next) { phidgets.on("data", function (type, id, value) { if (phidgetsReady) { console.log("phidgets >> " + type + " #" + id + " now at @ " + value); } }); phidgets.on("log", function (data) { if (typeof data != "object") { console.log("phidgets log >> " + data); } }); phidgets.on("error", function (e) { console.log("phidgets error >> " + e); }); phidgets.connect( { host: PhidgetHost, }, function (p) { phidgetsReady = true; phidgets.setOutput(0, false); phidgets.setOutput(1, false); phidgetData = p; pivotal.useToken(apiToken); pivotal.getProjects(function (err, resp) { if (err != null) { console.log(err.desc); process.exit(); } else { var found = false; for (var i in resp.project) { var project = resp.project[i]; if (project.id == projectID) { found = true; break; } } if (found == false) { console.log( "This user does not have access to project #" + projectID, ); process.exit(); } else { checkForTrackerStatus(); next(); } } }); }, ); }; var checkForTrackerStatus = function () { var filters = { project: projectID, limit: 1, newer_than_version: newer_than_version, }; pivotal.getActivities(filters, function (err, activities) { if (err != null) { console.log(err.desc); process.exit(); } else { if (activities.activity != null) { if ( activities.activity.event_type == "story_update" && newer_than_version > 0 ) { pivotal.getStory( projectID, activities.activity.stories.story.id, function (err, story) { if (err != null) { console.log(err.desc); process.exit(); } else { if (story.owned_by == OwnedBy) { if (story.current_state == "rejected") { fireGun(); } else if (story.current_state == "accepted") { happyness(); } } } }, ); } newer_than_version = activities.activity.version; } setTimeout(checkForTrackerStatus, checkTimerMS); } }); }; var fireGun = function (next) { console.log("BANG!"); phidgets.setOutput(0, true); setTimeout(phidgets.setOutput, 1000, 0, false); if (typeof next == "function") { next(); } 100; }; var happyness = function (next) { console.log("Yay!"); phidgets.setOutput(1, true); setTimeout(phidgets.setOutput, 1000, 1, false); if (typeof next == "function") { next(); } }; init(function () { console.log("READY!"); 111; }); ``` ### And here is a [video of everything working](http://www.youtube.com/watch?v=d4KIeTTp3qA) Enjoy! --- --- url: /blog/post/2012-09-10-production-deployment-with-node-js-clusters.md description: How do you deploy a node.js app reliably? --- ![](/images/medium-export/1__VUhrsBC1AkJ0UDL4PgPG8Q.png) #### Update (12/18/2012) These ideas have now been formally incorporated into the [actionHero](http://actionherojs.com/) project. To learn how to launch actionHero in a clustered way, [check out the wiki](https://github.com/evantahler/actionHero/wiki/Running-ActionHero). #### Update (12/5/2012) While other servers also use SIGWINCH to mean "kill all my workers" it’s important to note that this signal is fired when you resize your terminal window (responsive console design anyone?). Be sure that only demonized/background process respond to SIGWINCH! 
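A minimal sketch of one way to honor that advice in the cluster leader: only register the handler when the process isn't attached to a terminal. The `log` helper here is assumed to be the one from the leader script later in this post; `process.stdout.isTTY` is only set when a terminal is attached.

```js
// only respond to SIGWINCH when running in the background (no TTY attached),
// so resizing a console window doesn't stop all of the workers
if (!process.stdout.isTTY) {
  process.on("SIGWINCH", function () {
    log("Signal: SIGWINCH");
    // ... tell every worker to stop, as in the leader script below
  });
}
```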
### Introduction

I was asked recently how to deploy [actionHero](http://actionHerojs.com) to production. Initially, my naive answer was to simply suggest the use of forever, but that was only a partial solution. Forever is a great package which acts as a sort of daemon-izer for your projects. It can monitor running apps and restart them, handle stdout/err and logging, etc. It's a great monitoring solution, but when you say `forever restart myApp` you **will** incur some downtime. I've spent the past few days working on a full solution.

**Footnote** — This is a \*nix (osx/linux/solaris) answer only. I'm fairly sure this kind of thing won't work on windows.

At [TaskRabbit](http://www.taskrabbit.com) (a Ruby/Rails Ecosystem) we have put a lot of effort into "properly" implementing [Capistrano](https://github.com/capistrano/capistrano) and [Unicorn](http://unicorn.bogomips.org/) so that we can have 0-downtime deployment. This is integral to our culture, and allows us to deploy worry-free a number of times each day. This also makes the code-delta in our deployments smaller, and therefore less risky (saying nothing of the value in reducing the time it takes to launch new features). 0-downtime deployments are good.

### Framework

Ok, so how to make a 0-downtime node deployment? Forever is certainly part of the solution, but the meat of the answer lies in the [node.js cluster module](http://nodejs.org/api/cluster.html) (and how awesome node is at being unix-y). The cluster module allows one node process to spawn children and share open resources with them. This might include file pointers, but in our case, we are going to share open ports and unix sockets. In a nutshell, if you have one worker open port 80, other workers can also listen on port 80. The cluster module will share the load between all available workers.

The cluster module is usually approached as a way to load balance an application (and it's great at that), but it can also be used as a way to hand over an open connection from one worker to another. In this way, we can tell one worker to die off while another is starting. With enough workers running (and some basic callbacks), we can ensure that there is always a worker around to handle any incoming requests.

### Application Considerations

This is some core node magic right here. Whether you have created an HTTP server or a direct TCP server, the default behavior of [server.close()](http://nodejs.org/api/net.html#net_server_close_cb) is actually quite graceful. Check out the docs and you will see that the server will close, but not kick out any existing connections, and finally, when all clients have left, a callback is fired. We will be waiting for this callback to know that it is safe to close out our server.

For an HTTP server this is pretty straightforward: no new connections will be allowed in, and any long-lasting connections will have the chance to finish. In our cluster setup, that means that any new connections that come in during this time will be passed to another worker… exactly what we want! (note: it's possible that a connection might not ever finish, but that's out of scope for this discussion)

Raw TCP connections are another matter. The server behaves the same way, but TCP connections never really expire, so if we don't kick out existing connections, the server will never exit.
Take a look at this snippit of code from actionHero’s socketServer: ```js api.socketServer.gracefulShutdown = function (api, next, alreadyShutdown) { if (alreadyShutdown == null) { alreadyShutdown = false; } if (alreadyShutdown == false) { api.socketServer.server.close(); alreadyShutdown = true; } for (var i in api.socketServer.connections) { var connection = api.socketServer.connections[i]; if (connection.responsesWaitingCount == 0) { api.socketServer.connections[i].end("Server going down NOW\r\nBye!\r\n"); } } if (api.socketServer.connections.length != 0) { api.log( "[socket] waiting on shutdown, there are still " + api.socketServer.connections.length + " connected clients waiting on a response", ); setTimeout(function () { api.socketServer.gracefulShutdown(api, next, alreadyShutdown); }, 3000); } else { next(); } }; ``` In the part of the TCP server that handles incoming requests, we increment the connection’s connection.responseWaitingCount and when the action completes and the response is sent to the client, we decrement it. This way we can approximate the client is "waiting for a response" or not. It’s important to remember that TCP clients can request many actions at the same time (unlike HTTP, where each request can only ever have one response). Note that once a client is deemed fit to disconnect we send a ‘goodbye’ message. The client then is responsible for reconnecting, and they will come back and connect to another worker. WebSockets work the same way as the TCP server does. Once we disconnect each client, they will reconnect to a new worker node, as the old one has stopped taking connections. socket.io’s browser code is very well written and will reconnect and retry any commands that have failed. [socket.io](http://socket.io/) binds to the http server we talked about earlier, so shutting it down will also disconnect all websocket clients. Now that we have servers that gracefully shut down, how do we use them? ### The Cluster Leader The reason for gracefully disconnecting each client was that we are not going to restart each server, but rather kill it entirely and create a new one. Creating a new worker ensures that each process will load in any new code and have a fresh environment to work within. The Cluster leader has a few main goals: * **Sharing Resources**. We get this for free from the node.js cluster module * **Worker management**. This includes restarting them when they fail along with responding to requests to start/stop them * **Responding to Signals**. We’ll cover this in a moment, but essentially this is you you communicate with the leader once he is up and running. * **Logging**. You gotta’ log the state of your cluster! ### Sharing Resources: As mentioned before, open sockets and ports can be shared (for free) by all children in the cluster. Yay node! ### Worker Managment: The cluster module provides a message passing interface between leader and follower. You can pass anything that can be JSON.stringified (no pointers). We can use these methods to be aware of when a booted worker is ready to accept connections, and conversely, we can tell a worker to begin its graceful shut down process (rather than outright killing the process). Take a look at the worker code at the bottom of the article for more details. Note the use of process.send(msg) within the callbacks for actionHero.start() and actionHero.stop(). ### Responding to Signals: Unix signals are the classy way to communicate with a running application. 
You send them with the kill command, and each signal has a common meaning: * **SIGINT / SIGTERM / SIGKILL** Kill the running process, with various degrees of severity. In our application we will treat these all as meaning "kill the leader and all of his workers" * **SIGUSR2** Tell the leader to reload the configuration of his workers. In our cluster, this will mean a rolling restart of each worker one-by-one. * **SIGHUP** reload all workers. For us, this will mean instantaneously kill off all workers and start new ones (will lead to potential downtime) * **SIGWINCH** kill off all workers * **TTIN / TTOU** add a worker / remove a worker So if you wanted to tell the leader to stop all of his workers (and his pid was 123), you would run kill -s WINCH 123 USR2 is the most interesting case here. While there are ways to "reload configuration" in a running node.js app (flush the module cache, reload all source modules, etc), it’s usually a lot safer just to start up a new app from scratch. I say that we are going to do a "rolling restart" because we literally are going to kill off the first worker, spawn a new one, and repeat. Assuming we have 2 or more workers, this means that there will always be at least one worker around to handle requests. Now this might lead to problems where some workers have an old version of your codebase and some workers have a new version, but usually that is desirable when compared with outright downtime. Oh, and try not to have more workers than you have CPUs! The main function in charge of these "rolling restarts" is here: ```js var reloadAWorker = function (next) { var count = 0; for (var i in cluster.workers) { count++; } if (workersExpected > count) { startAWorker(); } if (workerRestartArray.length > 0) { var worker = workerRestartArray.pop(); worker.send("stop"); } }; cluster.on("exit", function (worker, code, signal) { log("worker " + worker.process.pid + " (#" + worker.id + ") has exited"); setTimeout(reloadAWorker, 1000); // to prevent CPU-splsions if crashing too fast }); ``` When we initialize a rolling restart, we add all workers to the workerRestartArray, and then one-by-one they will be dropped. Note that on every worker’s exit, we run reloadAWorker(). This also ensures that if a worker died due to an error, we will start another one in its place (workersExpected is modified by TTIN and TTOU). The reason for the timeOut is to ensure that if a worker is crashing on boot (perhaps it can’t connect to your database) that the leader isn’t restarting workers are fast as possible… as this would probably lock up your machine. ### Deployment Notes * While your workers will load up new code changes on restart, the cluster leader will not. Unfortunately, you will need to restart it to catch any code changes. Luckily, you can use forever for this to do it quickly. When you restart the leader all of his workers will die off. * Building from the earlier comment, child process die when the parent dies. That’s just how it is (usually). That means that if for any reason the leader dies (kill, ctrl-c, etc), all of the workers will die too. That’s why it is best to keep the leader’s code base as simple as possible (and separate from the workers). 
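Since the leader writes its pid to a pidfile (see the `config` block in the leader script at the end of this post), a small sketch of how you might send these signals from a shell, assuming the default `./cluster_pidfile` path:

```bash
# signal the running cluster leader via its pidfile
kill -s USR2  $(cat ./cluster_pidfile)   # rolling restart of the workers
kill -s TTIN  $(cat ./cluster_pidfile)   # add a worker
kill -s TTOU  $(cat ./cluster_pidfile)   # remove a worker
kill -s WINCH $(cat ./cluster_pidfile)   # stop all workers
```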
This is also why we use something like forever to monitor it’s uptime and restart it if anything goes wrong * [I’ve talked about using Capistrano to deploy node.js applications before](http://blog.evantahler.com/deploying-node-js-applications-with-capistrano), but there are lots of methods to get your code on the server (fabric, chef, github post-commit hooks, etc). Once your leader is up and running, your deployment really only looks like 1) git pull 2) kill -s USR2 (pid of leader). Yay! ### Code Here is the state of actionHero’s cluster leader code at the time of this post. It’s likely to keep evolving, so [you can always check out the latest version on GitHub](https://github.com/evantahler/actionHero/blob/leader/scripts/) ### Follower ```js #!/usr/bin/env node // load in the actionHero class var actionHero = require(__dirname + "/../api.js").actionHero; // normally if installed by npm: var actionHero = require("actionHero").actionHero; var cluster = require("cluster"); // if there is no config.js file in the application's root, then actionHero will load in a collection of default params. You can overwrite them with params.configChanges var params = {}; params.configChanges = {}; // any additional functions you might wish to define to be globally accessable can be added as part of params.initFunction. The api object will be availalbe. params.initFunction = function (api, next) { next(); }; // start the server! var startServer = function (next) { if (cluster.isWorker) { process.send("starting"); } actionHero.start(params, function (api_from_callback) { api = api_from_callback; api.log("Boot Sucessful @ worker #" + process.pid, "green"); if (typeof next == "function") { if (cluster.isWorker) { process.send("started"); } next(api); } }); }; // handle signals from leader if running in cluster if (cluster.isWorker) { process.on("message", function (msg) { if (msg == "start") { process.send("starting"); startServer(function () { process.send("started"); }); } if (msg == "stop") { process.send("stopping"); actionHero.stop(function () { api = null; process.send("stopped"); process.exit(); }); } if (msg == "restart") { process.send("restarting"); actionHero.restart(function (success, api_from_callback) { api = api_from_callback; process.send("restarted"); }); } }); } // start the server! 
startServer(function (api) { api.log("Successfully Booted!", ["green", "bold"]); }); ``` ### Leader ```js #!/usr/bin/env node ////////////////////////////////////////////////////////////////////////////////////////////////////// // // TO START IN CONSOLE: `./scripts/actionHeroCluster` // TO DAMEONIZE: `forever start scripts/actionHeroCluster` // // ** Producton-ready actionHero cluster example ** // - workers which die will be restarted // - maser/manager specific logging // - pidfile for leader // - USR2 restarts (graceful reload of workers while handling requets) // -- Note, socket/websocket clients will be disconnected, but there will always be a worker to handle them // -- HTTP, HTTPS, and TCP clients will be allowed to finish the action they are working on before the server goes down // - TTOU and TTIN signals to subtract/add workers // - WINCH to stop all workers // - TCP, HTTP(s), and Web-socket clients will all be shared across the cluster // - Can be run as a daemon or in-console // -- Lazy Dameon: `nohup ./scripts/actionHeroCluster &` // -- you may want to explore `forever` as a dameonizing option // // * Setting process titles does not work on windows or OSX // // This example was heavily inspired by Ruby Unicorns [[ http://unicorn.bogomips.org/ ]] // ////////////////////////////////////////////////////////////////////////////////////////////////////// ////////////// // Includes // ////////////// var fs = require("fs"); var cluster = require("cluster"); var colors = require("colors"); var numCPUs = require("os").cpus().length; var numWorkers = numCPUs - 2; if (numWorkers < 2) { numWorkers = 2; } //////////// // config // //////////// var config = { // script for workers to run (You probably will be changing this) exec: __dirname + "/actionHero", workers: numWorkers, pidfile: "./cluster_pidfile", log: process.cwd() + "/log/cluster.log", title: "actionHero-leader", workerTitlePrefix: " actionHero-worker", silent: true, // don't pass stdout/err to the leader }; ///////// // Log // ///////// var logHandle = fs.createWriteStream(config.log, { flags: "a" }); var log = function (msg, col) { var sqlDateTime = function (time) { if (time == null) { time = new Date(); } var dateStr = padDateDoubleStr(time.getFullYear()) + "-" + padDateDoubleStr(1 + time.getMonth()) + "-" + padDateDoubleStr(time.getDate()) + " " + padDateDoubleStr(time.getHours()) + ":" + padDateDoubleStr(time.getMinutes()) + ":" + padDateDoubleStr(time.getSeconds()); return dateStr; }; var padDateDoubleStr = function (i) { return i < 10 ? 
"0" + i : "" + i; }; msg = sqlDateTime() + " | " + msg; logHandle.write(msg + "\r\n"); if (typeof col == "string") { col = [col]; } for (var i in col) { msg = colors[col[i]](msg); } console.log(msg); }; ////////// // Main // ////////// log(" - STARTING CLUSTER -", ["bold", "green"]); // set pidFile if (config.pidfile != null) { fs.writeFileSync(config.pidfile, process.pid.toString()); } process.stdin.resume(); process.title = config.title; var workerRestartArray = []; // used to trask rolling restarts of workers var workersExpected = 0; // signals process.on("SIGINT", function () { log("Signal: SIGINT"); workersExpected = 0; setupShutdown(); }); process.on("SIGTERM", function () { log("Signal: SIGTERM"); workersExpected = 0; setupShutdown(); }); process.on("SIGKILL", function () { log("Signal: SIGKILL"); workersExpected = 0; setupShutdown(); }); process.on("SIGUSR2", function () { log("Signal: SIGUSR2"); log("swap out new workers one-by-one"); workerRestartArray = []; for (var i in cluster.workers) { workerRestartArray.push(cluster.workers[i]); } reloadAWorker(); }); process.on("SIGHUP", function () { log("Signal: SIGHUP"); log("reload all workers now"); for (var i in cluster.workers) { var worker = cluster.workers[i]; worker.send("restart"); } }); process.on("SIGWINCH", function () { log("Signal: SIGWINCH"); log("stop all workers"); workersExpected = 0; for (var i in cluster.workers) { var worker = cluster.workers[i]; worker.send("stop"); } }); process.on("SIGTTIN", function () { log("Signal: SIGTTIN"); log("add a worker"); workersExpected++; startAWorker(); }); process.on("SIGTTOU", function () { log("Signal: SIGTTOU"); log("remove a worker"); workersExpected--; for (var i in cluster.workers) { var worker = cluster.workers[i]; worker.send("stop"); break; } }); process.on("exit", function () { workersExpected = 0; log("Bye!"); }); // signal helpers var startAWorker = function () { worker = cluster.fork(); log("starting worker #" + worker.id); worker.on("message", function (message) { if (worker.state != "none") { log("Message [" + worker.process.pid + "]: " + message); } }); }; var setupShutdown = function () { log("Cluster manager quitting", "red"); log("Stopping each worker..."); for (var i in cluster.workers) { cluster.workers[i].send("stop"); } setTimeout(loopUntilNoWorkers, 1000); }; var loopUntilNoWorkers = function () { if (cluster.workers.length > 0) { log("there are still " + cluster.workers.length + " workers..."); setTimeout(loopUntilNoWorkers, 1000); } else { log("all workers gone"); if (config.pidfile != null) { fs.unlinkSync(config.pidfile); } process.exit(); } }; var reloadAWorker = function (next) { var count = 0; for (var i in cluster.workers) { count++; } if (workersExpected > count) { startAWorker(); } if (workerRestartArray.length > 0) { var worker = workerRestartArray.pop(); worker.send("stop"); } }; // Fork it. cluster.setupleader({ exec: config.exec, args: process.argv.slice(2), silent: config.silent, }); for (var i = 0; i < config.workers; i++) { workersExpected++; startAWorker(); } cluster.on("fork", function (worker) { log("worker " + worker.process.pid + " (#" + worker.id + ") has spawned"); }); cluster.on("listening", function (worker, address) {}); cluster.on("exit", function (worker, code, signal) { log("worker " + worker.process.pid + " (#" + worker.id + ") has exited"); setTimeout(reloadAWorker, 1000); // to prevent CPU-splsions if crashing too fast }); ``` Enjoy! 
---

---
url: /blog/post/2020-02-18-production-node-applications-with-docker.md
description: >-
  Tips and tricks to shut down your docker applications properly. No more lost
  data!
---

![](/images/medium-export/1__lOqD__NHM7gQY9Ax9TF4wZg.jpeg)

Recently, I've been noticing that a high number of folks using [Node Resque](https://github.com/actionhero/node-resque) have been reporting similar problems relating to the topics of shutting down your node application and properly handling uncaught exceptions and unix signals. These problems are exacerbated with deployments involving Docker or a platform like [Heroku](https://heroku.com/), which uses Docker under-the-hood. However, if you keep these tips in mind, it's easy to have your app work exactly like you want it to… even when things are going wrong!

I've added a Docker-specific example to Node Resque which you can check out here, and this blog post will dive deeper into the 3 areas that the example focuses on.

Node Resque is a background-job processing framework for Node & Typescript which stores jobs in Redis. It supports delayed and recurring jobs, plugins, and more. Node Resque is a core component of the [Actionhero](https://www.actionherojs.com) framework.

### 1. Ensure Your Application Receives Signals, AKA Don't use a Process Manager

You shouldn't be using NPM, YARN, PM2 or any other tool to "run" your application inside of your Docker images. You should be calling only the node executable and the file you want to run. This is important so that the signals Docker wants to pass to your application actually get to your app!

There are lots of [Unix signals](https://www.tutorialspoint.com/unix/unix-signals-traps.htm) that all mean different things, but in a nutshell it's a way for the Operating System (OS) to tell your application to do something, usually implying that it should change its lifecycle state (stop, reboot, etc). For web servers, the most common signals will be `SIGTERM` (terminate), `SIGKILL` (kill, aka: "*no really stop right now I don't care what you are working on*") and `SIGUSR2` (reboot). Docker, assuming your base OS is a \*NIX operating system like Ubuntu, Red Hat, Debian, Alpine, etc, uses these signals too. For example, when you tell a running Docker instance to stop with `docker stop`, it will send `SIGTERM` to your application, wait some amount of time for it to shut down, and then do a hard stop with `SIGKILL`. That's the same thing that would happen with `docker kill` — it sends `SIGKILL` too. What are the differences between stop and kill? That depends on how you write your application! We'll cover that more in section #2.

So how do you start your node application directly? Assuming you can run your application on your development machine with `node ./dist/server.js`, your docker file might look like this:

```docker
FROM alpine:latest
WORKDIR /app
RUN apk add --update nodejs nodejs-npm
COPY . .
RUN npm install
CMD ["node", "./dist/server.js"]
EXPOSE 8080
```

And, be sure you don't copy your local `node_modules` with a `.dockerignore` file:

```bash
node_modules
*.log
```

We are using the `CMD` directive, not `ENTRYPOINT`, because we don't want Docker to use a subshell. `ENTRYPOINT` and `CMD`, when not given an array of 2+ arguments, work by calling `/bin/sh -c` and then your command… which can trap the signals it gets itself and not pass them on to your application. If you used a process runner like `npm start`, the same thing could happen. You can learn more about docker signals & node here.

### 2.
Gracefully Shut Down your Applications by Listening for Signals Ok, so we are sure we will get the signals from the OS and Docker… how do we handle them? Node makes it really easy to listen for these signals in your app via: ```js process.on("SIGTERM", () => { console.log(`[ SIGNAL ] - SIGTERM`); }); ``` This will prevent Node.JS from stopping your application outright, and will give you an event so you can do something about it. … but what should you do? If you application is a web server, you might: 1. Stop accepting new HTTP requests 2. Toggle all health checks (ie: `GET /status`) to return \`false\` so the load balancer will stop sending traffic to this instance 3. Wait to finish any existing HTTP requests in progress. 4. And finally… exit the process when all of that is complete. If your application uses Node Resque, you should call `await worker.end()`, `await scheduler.end()` etc. This will tell the rest of the cluster that this worker is: 1. About to go away 2. Lets it finish the job it was working on 3. Remove the record of this instance from Redis If you don’t do this, the cluster will think your worker should be there and (for a while anyway) the worker will still be displayed as a possible candidate for working jobs. In [Actionhero](https://www.actionherojs.com) we manage this at the application level `await actionhero.process.stop()` and allow all of the sub-systems (initializers) to gracefully shut down — servers, task workers, cache, chat rooms, etc. It’s important to hand off work to other members in the cluster and/or let connected clients know what to do. A robust collection of process events for your node app might look like: ```js async function shutdown() { // the shutdown code for your application await app.end(); console.log(`processes gracefully stopped`); } function awaitHardStop() { const timeout = process.env.SHUTDOWN_TIMEOUT ? parseInt(process.env.SHUTDOWN_TIMEOUT) : 1000 * 30; return setTimeout(() => { console.error( `Process did not terminate within ${timeout}ms. Stopping now!`, ); process.nextTick(process.exit(1)); }, timeout); } // handle errors & rejections process.on("uncaughtException", (error) => { console.error(error.stack); process.nextTick(process.exit(1)); }); process.on("unhandledRejection", (rejection) => { console.error(rejection.stack); process.nextTick(process.exit(1)); }); // handle signals process.on("SIGINT", async () => { console.log(`[ SIGNAL ] - SIGINT`); let timer = awaitHardStop(); await shutdown(); clearTimeout(timer); }); process.on("SIGTERM", async () => { console.log(`[ SIGNAL ] - SIGTERM`); let timer = awaitHardStop(); await shutdown(); clearTimeout(timer); }); process.on("SIGUSR2", async () => { console.log(`[ SIGNAL ] - SIGUSR2`); let timer = awaitHardStop(); await shutdown(); clearTimeout(timer); }); ``` Let's walk though this: 1. We create a method to call when we should shutdown our application, `shutdown`, which contains our application-specific shutdown logic. 2. We create a "hard stop" fallback method that will kill the process if the shutdown behavior doesn’t complete fast enough, `awaitHardStop`. This is to help with situations where an exception might happen during your shutdown behavior, a background task is taking too long, a timer doesn’t resolve, you can’t close your database connection… there are lots of things that could go wrong. We also use an Environment Variable to customize how long we wait `process.env.SHUTDOWN_TIMEOUT` which you can configure via Docker. 
If the app doesn’t exist in in this time, we forcibly exit the program with \`1\`, indicating a crash or error, 3. We listen for signals, and (1) start the "hard stop timer", and then (2) call `await shutdown().` 4. If we successfully shutdown we stop timer, and exit the process with \`0\`, indicating an exit with no problems. **Note**: We can listen for any unix signal we want, but we should never listen for `SIGKILL`. If we try to catch it with a process listener, and we don’t immediately exit the application, we’ve broken our promise to the operating system that `SIGKILL` will immediately end any process… and bad things could happen. ### 3. Log Everything Finally, log the heck out of signaling behavior in your application. It’s innately hard to debug this type of thing, as you are telling your app to stop… but you haven’t yet stopped. Even after `docker stop`, logs are still generated and stored…. And you might need them! In the Node Rescue examples, we log all the stop events and when the application finally exists: ```bash docker logs -f {your image ID} … (snip) scheduler polling scheduler working timestamp 1581912881 scheduler enqueuing job 1581912881 >> {"class":"subtract","queue":"math","args":\[2,1\]} scheduler polling \[ SIGNAL \] - SIGTERM scheduler ended worker ended processes gracefully stopped ``` So, if you: 1. Ensure Your Application Receives Signals, AKA Don’t use a Process Manager 2. Gracefully Shut Down your Applications by Listening for Signals 3. Log Everything You should have no problem creating robust node applications that are deployed via Docker, and are a pleasure to monitor and debug. --- --- url: /blog/post/2016-10-28-productionizing-flynn.md description: Flynn is a an open-source self-hosted Heroku replacement… and it is great. --- ![](/images/medium-export/1__k9__uGIssdqP4JtbdaGeYNQ.png) [Flynn](https://flynn.io) is a an open-source self-hosted Heroku replacement… and it is ***great***. To me, it strikes the right balance of: * Running your apps in a modern, highly-available [12-factor way](https://12factor.net/) * Allows you deploy via either Docker or Git * Provides some of the underlying databases you might need Yes, when comparing it to some of the more elaborate Enterprise orchestration tools like [Kubernetes](http://kubernetes.io/) or even [Rancher](http://rancher.com/), it lacks some features and customizations, but when you want to route & run your apps (and you are SMB), it’s approach is a pleasure to work with. Personally, I’m more inclined to run my databases directly on other hosts so I can manage and back them up directly with something like Ansible\* so this is Flynn for my use-cases. I’m primarily talking about hosting user-facing web applications and API servers. *\* I don’t feel that Docker is ready for prime-time regarding persistent data, i.e. databases, but that is a whole other article…* This post contains the instructions to run a Flynn cluster at the *Root* of your domain. Doing so allows you to manage all of your applications (from your apex, `www.`, `api.` and whatever else you might want to host) from one cluster. Flynn is a new tool, so configuring it to work this way requires a few tweaks. The Flynn team is working to make this easier going forward… but for now, [install the Flynn CLI on your local computer](https://flynn.io/docs/cli), and lets go! The version of Flynn this was based on (along with many GitHub issue notes and the [Flynn documentation](https://flynn.io/docs/basics)) is `v20161015.2`. 
### Creating your Flynn Cluster

Even though I'm running my Flynn cluster on AWS, I do not like the built-in Flynn AWS installer. It uses cloudformation to create custom security groups, integrates directly with Route53, and generally does some magic which isn't customizable yet. That said, you can create your own cluster from any Ubuntu 14 AMI, and use the SSH tool to bootstrap your cluster. It works great!

Notes on bootstrapping your own cluster:

* Create at least 3 servers, all with the same hard-drive size and Ubuntu 14 AMI. Create them in different availability zones. The same hard drive size is important, because Flynn will create *identical* ZFS mounts on all of your hosts… so they need to match in capacity.
* Every node needs port 443 (https) and 80 (http) open to the world. You will **not** be using a load balancer… this takes some getting used to (more on this later in the DNS section).
* Get Elastic IPs and apply them to every node in your Flynn cluster. We'll be relying on these IPs later in the DNS section too.
* Every node in the cluster needs to be able to talk to each other. The easiest way to do that is to create a security group called "flynn" and allow it to have open access to *itself* using "All traffic".

![](/images/medium-export/1__mrd9offHsy4fvMhvpkrFLQ.png)

As you create external services, you can create new security groups for each one with the same self-access policy, ie *production-elasticsearch*. Only your Elasticsearch servers and the Flynn nodes need it.

Once you have your servers up and running and your security groups applied, you can bootstrap your cluster with the Flynn installer. On your local machine, run `flynn install`. This will open your web browser to handle the rest:

![](/images/medium-export/1____uB__xf__wKUZhpSpLkDltQw.png)

At the end of the install and bootstrap process, you are asked to download the certificate needed to access the dashboard. You don't need to do this, we'll be uploading our own shortly.

### Cluster Monitoring

As of this writing, Flynn does not provide any monitoring or alerting on the cluster. This means that it is up to you to do so. I'm calling this out in its own section because it is important. You can use tools like DataDog, Monit, Server Density… whatever you like. One of the main reasons I like to not use the AWS Flynn cluster creation tool is that I prefer to bootstrap all of my hosts with these tools before joining them to the Flynn cluster. Are you running out of RAM? Hard Drive Space? How will you know until something goes wrong unless you are monitoring?

![](/images/medium-export/1__3s3xxNXA1jevIv5DJFITcw.png)

Flynn does provide a status endpoint that an external service can poll, IE: http://status.site.com. You can learn more about using it [here](https://flynn.io/docs/production#monitoring).

### DNS

Flynn handles all of your routing for you. Any host can redirect a request to any other (mesh routing) to serve the client's request. It is rather cool! To this end, you need to configure DNS in the following way:

* **site.com** (apex domain) should be an A record with the (elastic/static) IPs of your Flynn cluster nodes
* **\*.site.com** (wildcard subdomain) should be a CNAME to site.com

In this manner, all of your future domains to be hosted by Flynn will resolve to your cluster.
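As a concrete sketch of those two records (the IP addresses are placeholders from the documentation range; substitute the Elastic IPs of your nodes):

```text
site.com.     A      203.0.113.10
site.com.     A      203.0.113.11
site.com.     A      203.0.113.12
*.site.com.   CNAME  site.com.
```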
![](/images/medium-export/1__4hoIxA__TYf0TysZyraQtsw.png) If you are using Route53 you can also configure [health checks](http://docs.aws.amazon.com/Route53/latest/DeveloperGuide/dns-failover.html) to drop/add an IP address from your A-record should they become unresponsive. ### Certificates When you create a Flynn cluster, they automatically create a DNS entry for you on *flynnhub.com*. This is quick way to get up and running with Flynn for testing purposes, but we want to run Flynn for *production*. We want our own URLs, and therefore, our own security certificates. First, purchase a wildcard certificate for your domain \*.site.com. At the time of this writing, the cheapest option I found was though N[amecheap](https://www.namecheap.com/). Download your **cert.pem** and **key.pem** and have them ready. 1. Tell Flynn about the new URL we want to use: ```bash flynn cluster migrate-domain site.com ``` This will also update all references to the cluster’s URL in your `~/.flynnrc` the place where Flynn stores information about connecting to the cluster. 2. Upload your new wildcard certificate to Flynn’s **router** application. All parts of Flynn are Flynn apps… it’s Flynn all the way down! We are using the Environment of the **router** to hold the body of our certificate ```bash flynn -a router env set TLSCERT="$(cat cert.pem)" TLSKEY="$(cat key.pem)" ``` 3. At this point, if you use the Flynn CLI again, you’ll run into a certificate error. Something like this ```bash Error writing CA certificate: Get https://dashboard.site.com/ca-cert: pinned: the peer leaf certificate did not match the provided pin ``` The reason for this is that if you look in `~/.flynnrc`, you’ll see that your cluster has the following information saved about it: ```text default = "mycluster" [[cluster]] Name = "mycluster" Key = "xxxx" TLSPin = "yyyyy" ControllerURL = "https://controller.site.com" GitURL = "https://git.site.come" DockerPushURL = "https://docker.site.com" ``` **TLSPin** was created against the old flynhub.com certificate (or the self signed one). **TLSPin** allows the Flynn client to connect to self-signed servers without errors. Either way, it no longer matches the new certificate you uploaded. To generate the new value of TLSPin, do the following and replace the output in `~/.flynnrc`: ```bash openssl x509 -inform PEM -outform DER < cert.pem | openssl dgst -binary -sha256 | openssl base64 ``` Once you update the value of TLSPin, we can continue using the CLI to update our routes with our new certificate. For the following apps we need to update their routes to use the new security certificate, since they were created by the bootstrap process, before the new cert was uploaded: **controller**, **dashboard**, **gitreceive,** and **docker-receive**. For each app, you will do the following: ```bash # 1) look at the routes for the app flynn -a controller route ROUTE SERVICE ID STICKY LEADER PATH https:controller.site.come controller http/ceacfb7d-e238-462a-aed0-9f4d1c087ea9 false false / https:controller.abc.flynnhub.com controller http/d8553b80-f49d-4496-bc76-80492ba81256 false false / # See how there are 2 routes? One for your domain, and one for flynnhub.com? Save the Route ID for *your domain* to a local shell variable #2) export FLYNN_ID=http/ceacfb7d-e238-462a-aed0-9f4d1c087ea9 # Update the route to include the SSL certificates flynn -a controller route update $FLYNN_ID --tls-cert cert.pem --tls-key key.pem ``` Be sure to do this for all 4 of the default *public* applications. 
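Since the procedure is identical for each of those apps, it can be scripted. A rough sketch, assuming your domain's route ID can be picked out of the `flynn route` output by grepping for your domain (the ID is the third column in the listing above):

```bash
# update the certificate on each of the default public apps' routes
for app in controller dashboard gitreceive docker-receive; do
  ROUTE_ID=$(flynn -a $app route | grep "site.com" | awk '{print $3}')
  flynn -a $app route update $ROUTE_ID --tls-cert cert.pem --tls-key key.pem
done
```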
As of this writing, you cannot identify a route by anything other than its ID (not by name, for example), but the Flynn team is working on that.

### S3 BlobStore

The next thing to configure is using S3 as the "blobstore". The blobstore is where Flynn stores all the things you upload to it: Docker images, git repositories, etc. The reason for using S3 here is two-fold:

* Save the disk space on your Flynn hosts for other things, like application logs
* Share the uploads with all of your hosts without needing to transfer the image around

To do this, first create the S3 bucket you want to use. It should only be used for this purpose, and be in the same region as your hosts. Then, create an IAM user, note its AWS\_KEY and AWS\_SECRET, and give it the following permissions:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:DeleteObject",
        "s3:GetObject",
        "s3:ListBucket",
        "s3:PutObject",
        "s3:ListMultipartUploadParts",
        "s3:AbortMultipartUpload",
        "s3:ListBucketMultipartUploads"
      ],
      "Resource": [
        "arn:aws:s3:::YOURFLYNNBUCKET",
        "arn:aws:s3:::YOURFLYNNBUCKET/*"
      ]
    }
  ]
}
```

Now, we'll tell Flynn to use the S3 bucket and make it the default going forward. Again, all parts of Flynn are Flynn Apps, so we'll just be updating the environment settings of the **blobstore** application:

```bash
export AWS_KEY=ABC123
export AWS_SECRET=abc123
export AWS_REGION=us-east-1
export AWS_BUCKET=YOURFLYNNBUCKET

flynn -a blobstore env set BACKEND_S3MAIN="backend=s3 region=$AWS_REGION bucket=$AWS_BUCKET access_key_id=$AWS_KEY secret_access_key=$AWS_SECRET"
flynn -a blobstore env set DEFAULT_BACKEND=s3main
```

### Apps & Routing

Now that we are set up, we can deploy our applications! When you create a Flynn app, the *name* you give it is also its default route. So if you want an application to be served at `www.site.com`, then name your Flynn app `www` via

```bash
flynn create --name www
```

You can also add many routes to your application, and you can always update them. For your apex domain, you can either add that route (site.com) to an app of your choice, or create a very simple app which redirects all traffic to your `www` app. An example can be found in this 20 line [Node.js](https://medium.com/u/96cd9a1fb56) server: [**messagebot/redirector**](https://github.com/messagebot/redirector/blob/master/server.js)

When you push your first application via git, you'll probably notice an SSL/HTTPS error. Something like this:

```bash
remote: Internal server errorfatal: unable to access 'https://git.site.com/project.git/': The requested URL returned error: 500
```

This is because there is one more system setting that Flynn modifies: your `~/.gitconfig`. If you look at it you'll see 2 entries Flynn created: an authentication block and a certificate block:

```text
[http "https://git.site.com"]
	sslCAInfo = /Users/evan/.flynn/ca-certs/flynn-abc123.pem
[credential "https://git.site.com"]
	helper = /usr/local/bin/flynn git-credentials
```

Flynn does this (just as in the case of the CLI) so that the self-signed certificate will work. However, we don't want this any more, so just delete the **sslCAInfo** section, as we are now using a real, legitimate certificate.

### Access Security

The final topic I want to address is access security. Access to your Flynn cluster is gated only by a single key: the cluster "Key" in `~/.flynnrc`. This key is used both by the CLI and by git. Look above at the settings in `~/.gitconfig` again.
When authenticating to Flynn's git server, it runs a key helper (`flynn git-credentials`) on OSX. That ends up injecting an entry into your Keychain which looks like this:

![](/images/medium-export/1__F2ekvZKwbwc__bo6gEBwECQ.png)

Yep, that's the same KEY used again as a git password. This is fine, and is secure… but it has no notion of authorization. **All Flynn users have access to all parts of Flynn**, and there is **no way to revoke access to a specific user or app**. If you are a small shop, this is probably OK. If not… you probably want to wait for a later version of Flynn which introduces these features. The Flynn team has been very responsive, and they are working on per-user access controls.

---

---
url: >-
  /blog/post/2019-12-30-proxying-free-heroku-dynos-though-cloudflare-to-a-custom-domain.md
description: >-
  The Actionhero project runs a number of websites for the community —
  documentation, sample apps, etc. We often rely on Heroku's free…
---

The [Actionhero project](https://www.actionherojs.com/) runs a number of websites for the community — documentation, sample apps, etc. We often rely on Heroku's free hosting to run these applications. Heroku makes it very easy to deploy your apps in a [12-factor](https://12factor.net/) way, attach databases, and deploy with a simple `git push`. It's a great, free option to deploy your applications.

The [Actionhero project](https://www.actionherojs.com/) uses Cloudflare as our DNS provider, as they provide a simple interface to enforce HTTPS, add caching, and a whole host of other features… and it's also free!

![](/images/medium-export/1__VGUvH5BMZsoC0N8yR3Hn1g.png)

One of the Cloudflare features we have enabled is `full` HTTPS encryption across the whole domain, meaning that traffic remains HTTPS'd between you and the Cloudflare proxy (which provides caching), and remains HTTPS'd from Cloudflare to our servers at Heroku.

![](/images/medium-export/1__0hz4MdwWncQQzSOwBrK5LQ.png)

Heroku doesn't provide HTTPS certificates for free dynos/apps like the ones we are using, except on their own domain, `*.herokuapp.com`. By default your app will be visible on a domain like `my-app.herokuapp.com`. However, you can opt to add a custom domain to your app, like `my-cool-app.com`, but… you can't add HTTPS on the free Heroku tier. Your custom CNAME won't support it.

![](/images/medium-export/1__ox5GvhIjeCKMAACAzSmj7Q.png)

If you point your CNAME to this new custom Heroku DNS target, Cloudflare will keep directing the client to HTTPS, which Heroku won't like… and you'll end up with a "Too many redirects" error.

![](/images/medium-export/1__uL59CYZ1RV31uze3vgG2wQ.png)

The way to solve this is to use the original CNAME target Heroku gave you, even though it's no longer listed in the interface. Point the CNAME in Cloudflare at `*.herokuapp.com`, even though Heroku will tell you to use a new DNS target in the interface, and you will have encrypted traffic flowing in no time!

`CNAME chat.actionherojs.com chat-actionhero.herokuapp.com`

As an interesting note, older Heroku apps keep showing the original `*.herokuapp.com` CNAME target in the dashboard, and not a new custom one, even with a custom URL added. This behavior confirms that the above is a good path to get things working!

![](/images/medium-export/1__PIUvryf2i6AbGHyFuDZM3g.png)

---

---
url: /blog/post/2020-05-22-pull-the-data-you-actually-want.md
description: >-
  Streaming event data becomes costly and less useful over time. Reverse ETL
  solves these problems.
--- ![Pull doughnuts](/images/posts/2020-05-22-pull-the-data-you-actually-want/doughnuts.jpeg) There’s an underlying pattern prevalent today in many digital marketing tools that is causing problems. Wasted time, overpaying, slow velocity, and privacy issues for your customers are some of the results of this pattern. The problem is the over-reliance on Events. Specifically, the problem is that many marketing tools live in a world where they expect to be “pushed” data, when it would be so much better if they were “pulling” data when they needed it. In this blog post, we’ll explore the problems with event-based “Push” marketing & analytics tools, and how we can fix them by switching to a “Pull” based solution, like the one that [Grouparoo](https://github.com/grouparoo/grouparoo) is building. ## The Problems with “Push” and Events We have conducted interviews with over 70 marketing teams of all sizes throughout the world. What follows is a synthesis of some of the common problems they face with their current marketing tools. ### Overpaying One of the most telling critiques of event-based SaaS marketing tools is the fact that marketing teams are /actively/ pruning the events they are sending to control costs. Tools like Segment work best when you can build a robust profile of your customers' activity, but Segment charges you more when you store more data! Segment essentially holds their functionality hostage as you gain more users who in turn produce more events. Countless marketers lamented pain of having to decide which events to keep tracking versus which to stop tracking. Inevitably, a new campaign idea would hit them a few months later that would need the event that they stopped tracking. These teams and marketers just didn’t have the money to keep sending every event. This takes us to the Stale Data problem. ### Stale Data When using only events to model your customers, there’s a huge lag time between when you start capturing that data and when you can use it. For example, say you want to run a campaign targeting customers who haven’t purchased in 6 months. If you start sending `purchase` events in June, the soonest you can start your campaign is December. Not only is that a long time to wait, but you’ve also sacrificed your team’s agility to modify those events or the campaign while you wait, [minimizing the shots you get to take](https://www.grouparoo.com/blog/the-shots-you-get-to-take). ### Lost History In addition to the slow ramp up time for a new campaign based on events, there’s the problem of lost events. With poor mobile connections, errors on your web pages & apps, slow vendors, and increasingly prevalent ad-blocking tools, it’s very easy to lose an event. Every marketer we talked to had their own less-than-scientific process they used to explain discrepancies between events and product data. Everyone had a different process and no one really trusted an event-based data source. If events are the only way you model your customers, it can be devastating if you miss a `changed-email-address` event - you might never be able to reach your customers at all! Coupling this issue with the Stale Data problem, there’s no way to fully model customers you had before you started sending events. There’s no way to compare the profile you’ve built in Segment against your product database, where the customer’s data is actually stored with confidence. ### Privacy Nightmare Finally, there’s a challenging privacy story regarding events. 
You have a relationship with your customers and part of that relationship is based on trust. How sure are you that Segment or Mixpanel is storing your customers' data safely? How many other services do your events pass though on the way to them? Google Analytics or Google Tag Manager? AdMob? Facebook or Twitter Pixels? The list goes on and on. Any one of these vendors is a potential vector for attack, event theft, or event manipulation. With regulations like GDPR and CCPA and many more on the way, you have a clear legal responsibility to keep your customers' data safe. Part of that responsibility is ensuring that the companies you share data with will update, delete, and anonymize data when *you* ask them too. What’s your process for doing that for all of your past events? ## The future is "Pull" How would you build things if you were starting today? We would focus on pulling in data from our existing sources, of course! Now, imagine you were starting to build a new marketing tool from scratch, like [Grouparoo](https://www.grouparoo.com) is. You aren’t burdened by the poor legacy choices of Segment, Mixpanel, and the rest. How would you build things if you were starting today? We would focus on pulling in data from our existing data sources. With Grouparoo, events are a way to augment your customer profiles, not the main source of critical data. Grouparoo’s goal is to make it easy for non-technical members of your team to build robust customer profiles and groups. Then, they can synchronize that data in a safe way to communication and advertising partners. A highly-functioning digital marketing team can quickly add new profile properties, augment them with data from your product and data warehouse(s), and start running a new campaign in minutes, rather than weeks. ### Within your Firewall To be in the best position to pull in data, the Grouparoo application should be located within your company firewall. This means that Grouparoo should run on *your* servers and have (read) access to your databases. In this way, no data ever leaves your company’s control, and you can always inspect, audit, and change it. You don’t need to send your customer data over the open internet to a third party just to build a new cohort. ### Safe and Secure Since your customer data lives where it belongs on your servers, you can make changes and deletions as needed, and you can prove it. You can build a transparent GDPR/CCPA process at your company with Grouparoo. Grouparoo has robust access controls and logs to ensure that every profile and group is managed the way it should be. ### Cost Controls Since Grouparoo is running on your servers, there’s no additional cost to import more data. If you want to add more properties to your customer profiles to build more fine-grained groups and cohorts, then do it! Have you imported a facet you no longer need? Delete it! Grouparoo doesn't charge per event, or any other measure of data quantity, so you are free to experiment at will. ### Always up-to-date The “Pull” pattern has a wonderful property; Grouparoo can check up on the data whenever we are about to use it! This means that rather than waiting for an event to (possibly) be sent to Grouparoo to update a customer’s preferences or LTV, we can ask the primary source. Grouparoo will always re-import the properties of every profile before sending a communication to the customer, or syncing with a third party. 
This means that you can be sure that you’ll always be working with the latest customer preferences, purchase, email address, names, etc. when communicating with them. No lost events will prevent you from communicating effectively. ### Ready for ad-blocking technology by respecting consumer choices Finally, the pattern of tracking and sending **so many** events from our sites and applications is leading to more and more consumers blocking tools like Segment, Mixpanel, Amplitude, and Google Analytics from getting any events at all. In fact, [major browsers are now shipping with defaults to prevent third-party tracking and cookies](https://www.theverge.com/2020/3/24/21192830/apple-safari-intelligent-tracking-privacy-full-third-party-cookie-blocking). Rather than fight the modern privacy-conscious consumer because your marketing stack relies on events, embrace the choices they are making. Utilize your source-of-truth of their behavior and preferences you already have, your product database. ## Grouparoo’s Open-Source Promise There’s still a place for events in your marketing stack, but they shouldn’t be the primary source of any piece of customer data. Grouparoo’s goal is to embrace the “Pull” data model as much as we can, and this means making it as easy as possible for you to install and run it within your cloud. To that end, Grouparoo’s core product is **available for free**, under the [Mozilla 2.0 Open Source License](https://github.com/grouparoo/grouparoo/blob/main/LICENSE.txt). You can follow our progress [on Github](https://github.com/grouparoo/grouparoo) and join our developer community at [community.grouparoo.com](https://www.grouparoo.com/docs/community). --- --- url: /blog/post/2015-03-12-rebuilding-capistrano-with-ansible.md description: Solid Ruby deployments with Ansbile --- ![](/images/medium-export/1__w0iIGUfsxNXvUBKqrC__uSA.png) ### Introduction At [TaskRabbit](https://www.taskrabbit.com) we use [Ansible](http://www.ansible.com) to configure and manage our servers. Ansible is a great tool which allows you write easy-to-use playbooks to configure your servers, deploy your applications, and more. The "More" part was what led us to switch from Chef to Ansbile. While both tools can have a "provision" action, you can make playbooks for all sorts of things with Ansible, including application deployment! For the past 4 years, TaskRabbit was using [Capistrano](http://capistranorb.com/) to deploy our Rails applications. We built a robust and feature-rich set of plugins which: * changed the way we logged during deployments (better timestamps, shared the deployment logs with the rest of the team, etc) * became a rails module we could plug into each of our applications with minimal configuration changes * standardized some of our deployment steps * and codified our best practices (ie: cap staging deploy:migrations should work for all apps; All apps should wait for the Unicorns to reboot before clearing cache, etc) Eventually, we started adding more and more non-rails (Sinatra), and then non-ruby (Node.js) apps. I’ve [written before](http://blog.evantahler.com/blog/deploying-node-js-applications-with-capistrano.html) on how you can use Capistrano to deploy *anything*, including node.js applications. That said, at some point having a ruby dependency for a 500K node app seems silly… but at least we were consistent and clear how all of our projects were to be deployed. 
Any developer in the company, even if they never touched a line of node before, knew how the app was to be deployed to production. Then came Ansible. One of the things that always irked me about Capistrano was that it required duplication of data. Why do I need to keep a list of servers and roles in a deploy.rb file within each application when the authoritative source for that data is our provisioning tool (previously Chef-Server, now the ansible project’s [inventory](http://blog.evantahler.com/blog/2015/03/12/ansible-dynamic-static-inventory/))? Doubly so, every time we added or removed a node from chef, I need to be sure to update the deploy.rb. There are some tools out there which attempt to link Chef and Capistrano, but none of the ones I tried worked. More worrisome was the fact that some of the steps for deployment were duplicated in chef, or Chef was shelling out to Capistrano (which required a *full* ruby environment) to deploy. I’m happy to say that TaskRabbit now deploys all of our applications via Ansible, and no longer uses Capistrano. We were able to keep a homogenous command set and duplicate most of Capistrano’s features in very small amount of code. Here’s how we did it: ### Server Background * We deploy on Ubuntu 14 TLS servers. * We have a specific user, denoted by \`\` in these roles. * Our application directory structure exactly mirrors that of Capistrano (it’s a great layout), IE: ```text /home/{{ deploy_user }}/www/{{ application }}/ - current (symlink to release) - releases - timestamp_1 - app - config (symlinks to ../../shared/config) - tmp (symlink to ../../shared/tmp) - pids (symlink to ../../shared/pids) - timestamp_2 - timestamp_3 - shared - tmp - config (ymls and other config files previously config'd by ansible) - public - cached-copy (git repo in fullt) - logs - pids - sockets ``` We define inventories by RAILS\_ENV (or NODE\_ENV as the case may be), and then divide up each application to the sub-roles that it requires. I’ll be using the following example inventories/production file as reference: ```text myApp-web1.domain.com myApp-web2.domain.com myApp-worker1.domain.com myApp-worker2.domain.com myApp-redis.domain.com myApp-mysql.domain.com [production] myApp-web1.domain.com myApp-web2.domain.com myApp-worker1.domain.com myApp-worker2.domain.com myApp-redis.domain.com myApp-mysql.domain.com [production:vars] rails_env=production node_env=production cluster_env=production [myApp] myApp-web1.domain.com myApp-web2.domain.com myApp-worker1.domain.com myApp-worker2.domain.com [myApp:unicorn] myApp-web1.domain.com myApp-web2.domain.com [myApp:resque] myApp-worker1.domain.com myApp-worker2.domain.com # ... 
``` ### Playbook and API The entry point to our deployment playbook is the deploy.yml playbook: ```yaml - hosts: "{{ host | default(application) }}" max_fail_percentage: 1 roles: - { role: deploy, tags: ["deploy"], sudo: no } - { role: monit, tags: ["monit"], sudo: yes } ``` and a rollback.yml playbook: ```yml - hosts: "{{ host | default(application) }}" max_fail_percentage: 1 tasks: - include: roles/deploy/tasks/rollback_symlink.yml - include: roles/deploy/tasks/restart_unicorn.yml - include: roles/deploy/tasks/restart_resque.yml ``` This allows us to have the following API options: * deploy one app to all staging servers (normal use): ```bash ansible-playbook -i inventories/staging deploy.yml --extra-vars="application=myApp migrate=true" ``` * deploy one app to 1 staging server (--limit): ```bash ansible-playbook -i inventories/staging deploy.yml --extra-vars="application=myApp migrate=true branch=mybranch" --limit staging-server-1.company.com ``` * deploy myApp production: ```bash ansible-playbook -i inventories/production deploy.yml --extra-vars="application=myApp migrate=true" ``` The beauty of the line ***- hosts: "{{ host | default(application) }}"*** in the playbook is that you can reference the servers in question by the group they belong to, which in our case matches the application names, and then sub-slice the group even further via optional --limit flags. ### Variables To make this playbook work, we need a collection of application metadata. This essentially mirrors the information you would provide within an application’s deploy.rb in Capistrano. However, moving this data to Ansible allows it to be used not only in both of the deployment/rollback playbooks, but also in provisioning if needed. Here’s some example data for our myApp application, which we can pretend is a Rails 4 application: From group\_vars/all ```yml applications: - myApp - myOtherApp application_git_url_base: git@github.com application_git_url_team: myCompany deploy_email_to: everyone@myCompany.com application_configs: myApp: name: myApp language: ruby roles: - unicorn - resque ymls: - database.yml - memcache.yml - redis.yml - facebook.yml - s3.yml - twilio.yml pre_deploy_tasks: - { cmd: "bundle exec rake assets:precompile" } - { cmd: "bundle exec rake db:migrate", run_once: true, control: migrate } - { cmd: "bundle exec rake db:seed", run_once: true, control: migrate } - { cmd: "bundle exec rake myApp:customTask" } post_deploy_tasks: - { cmd: "bundle exec rake cache:clear", run_once: true } - { cmd: "bundle exec rake bugsnag:deploy", run_once: true } resque_workers: - name: myApp workers: - { name: myApp-scheduler, cmd: "resque:scheduler" } - { name: myApp-1, cmd: "resque:queues resque:work" } - { name: myApp-2, cmd: "resque:queues resque:work" } #... ``` You can see here that we have defined a few things: * the configuration files needed for each app (that we place in /home/{{ deploy\_user }}/www/{{ application }}/shared/config as noted above) * metadata about the application, including the language (ruby) and the roles (unicorn and resque) * tasks to complete before and after the "deploy". The moment the "deploy" happens here is when the `current` symlink switches over to the new release. 
### The Role: Deploy roles/deploy/main.yml looks like this: ```yml - include: init.yml - include: git.yml - include: links.yml - include: config.yml - include: bundle.yml - include: pre_tasks.yml - include: reboot.yml - include: post_tasks.yml - include: cleanup.yml - include: email.yml - include: hipchat.yml ``` Let’s go through each step 1-by-1: ### init.yml ```yml - name: Generate release timestamp command: date +%Y%m%d%H%M%S register: timestamp run_once: true - set_fact: "release_path='/home/{{ deploy_user }}/www/{{ application }}/releases/{{ timestamp.stdout }}'" - set_fact: "shared_path='/home/{{ deploy_user }}/www/{{ application }}/shared'" - set_fact: "current_path='/home/{{ deploy_user }}/www/{{ application }}/current'" - set_fact: migrate={{ migrate|bool }} when: migrate is defined - set_fact: migrate=false when: migrate is not defined - set_fact: branch=master when: branch is not defined and cluster_env != 'production' - set_fact: branch=production when: cluster_env == 'production' - set_fact: keep_releases={{ keep_releases|int }} when: keep_releases is defined - set_fact: keep_releases={{ 6|int }} when: keep_releases is not defined - name: "capture previous git sha" run_once: true register: deploy_previous_git_sha shell: > cd {{ current_path }} && git rev-parse HEAD ignore_errors: true ``` You can see that we do a few things: * generate the release timestamp on one server to use on all of them * save the paths release\_path, shared\_path and current\_path, just like Capistrano * handle default values for the migrate, branch, and keep\_releases options * learn the git SHA of the previous release ### git.yml ```yml - name: update source git repo shell: "git fetch && git reset --hard origin/master" sudo: yes sudo_user: "{{ deploy_user }}" args: chdir: "{{ shared_path }}/cached-copy" when: "'{{application}}' in group_names" - name: Create release directory file: "state=directory owner='{{ deploy_user }}' path='{{ release_path }}'" sudo: yes sudo_user: "{{ deploy_user }}" when: "'{{application}}' in group_names" - name: copy the cached git copy shell: "cp -r {{ shared_path }}/cached-copy/. {{ release_path }}" sudo: yes sudo_user: "{{ deploy_user }}" when: "'{{application}}' in group_names" - name: git checkout shell: "git checkout {{ branch }}" sudo: yes sudo_user: "{{ deploy_user }}" args: chdir: "{{ release_path }}" when: "'{{application}}' in group_names" ``` This section ensures that we git-pull the latest code into the cached-copy, copy it into the new release\_directory, and then check out the proper branch. ### links.yml ```yml - name: ensure directories file: "path={{ release_path }}/{{ item }} state=directory" sudo: yes sudo_user: "{{ deploy_user }}" when: "'{{application}}' in group_names" with_items: - tmp - public - name: symlinks shell: "rm -rf {{ item.dest }} && ln -s {{ item.src }} {{ item.dest }}" sudo: yes sudo_user: "{{ deploy_user }}" when: "'{{application}}' in group_names" with_items: - { src: "{{ shared_path }}/log", dest: "{{ release_path }}/log" } - { src: "{{ shared_path }}/pids", dest: "{{ release_path }}/tmp/pids" } - { src: "{{ shared_path }}/pids", dest: "{{ release_path }}/pids" } #Note: Double symlink for node apps - { src: "{{ shared_path }}/sockets", dest: "{{ release_path }}/tmp/sockets", } - { src: "{{ shared_path }}/assets", dest: "{{ release_path }}/public/assets", } - { src: "{{ shared_path }}/system", dest: "{{ release_path }}/public/system", } ``` This creates symlinks from each deployed release back to shared. 
This enables us to save logs, pids, etc between deploys. ### config.yml ```yml - name: list shared config files shell: "ls -1 {{ shared_path }}/config" register: remote_configs when: "'{{application}}' in group_names" - name: symlink configs shell: "rm -f {{ release_path }}/config/{{ item }} && ln -s {{ shared_path }}/config/{{ item }} {{ release_path }}/config/{{ item }} " with_items: remote_configs.stdout_lines sudo: yes sudo_user: "{{ deploy_user }}" when: "'{{application}}' in group_names" ``` Here we source every file in app/shared/config/\* and symlink it into app/release/config/\*. ### bundle.yml ```yml - stat: path={{ release_path }}/Gemfile register: deploy_gemfile_exists - name: bundle install sudo: yes sudo_user: "{{ deploy_user }}" args: chdir: "{{ release_path }}" shell: > bundle install --gemfile {{ release_path }}/Gemfile --path {{ shared_path }}/bundle --without development test --deployment --quiet when: "'{{application}}' in group_names and deploy_gemfile_exists.stat.exists" ``` If there is a Gemfile in this project, we bundle install. ### pre\_tasks.yml ```yml - name: deployment pre tasks (all hosts) sudo: yes sudo_user: "{{ deploy_user }}" shell: > cd {{ release_path }} && RAILS_ENV={{ rails_env }} RACK_ENV={{ rails_env }} NODE_ENV={{ rails_env }} {{ item.cmd }} run_once: false when: > ('{{application}}' in group_names) and ({{ item.run_once | default(false) }} == false) and ({{ item.control | default(true) }} != false) with_items: "application_configs[application].pre_deploy_tasks" - name: deployment pre tasks (single hosts) sudo: yes sudo_user: "{{ deploy_user }}" shell: > cd {{ release_path }} && RAILS_ENV={{ rails_env }} RACK_ENV={{ rails_env }} NODE_ENV={{ rails_env }} {{ item.cmd }} run_once: true when: > ('{{application}}' in group_names) and ({{ item.run_once | default(false) }} == true) and ({{ item.control | default(true) }} != false) with_items: "application_configs[application].pre_deploy_tasks" ``` In the application\_configs part of our variable file, we defined a collection of tasks to run as part of the deploy. Here is where asset compilation would be run, etc. Note how when you define the task, we have the attributes "run\_once" and "control", i.e. { cmd: "bundle exec rake db:migrate", run\_once: true, control: migrate }. This means that the migration task should only be run on one host, and that it should only be run when the playbook is run with the flag --extra-vars='migrate=true'. This is how simple it is to build complex Capistrano-like roles. ### reboot.yml ```yml - name: Update current Symlink sudo: yes sudo_user: "{{ deploy_user }}" file: "state=link path={{ current_path }} src={{ release_path }}" notify: - deploy restart unicorn - deploy restart resque when: "'{{application}}' in group_names" - meta: flush_handlers ``` Now that all of our pre-tasks have been run, it’s time to actually change the deploy symlink and "restart" our applications. This simple role just changes the symlink, but the notifications are fairly involved. Some of your servers (Unicorn) may be able to gracefully restart with a simple signal, while others (like resque workers) need to fully stop and start to accept new code. 
Ansible makes it easy to build notification handlers that fit your needs: ### handlers/main.yml ```yml ## UNICORN ## - name: "deploy restart unicorn" when: "'unicorn' in application_configs[application].roles and '{{application}}:unicorn' in group_names" ignore_errors: yes shell: "kill -s USR2 `cat {{ current_path }}/tmp/pids/unicorn.pid`" sudo: yes sudo_user: "{{ deploy_user }}" notify: - ensure monit monitoring unicorn - name: ensure monit monitoring unicorn monit: name: unicorn-{{ application }} state: monitored sudo: yes ## RESQUE ## - name: deploy restart resque ignore_errors: yes shell: "kill -s QUIT `cat {{ current_path }}/tmp/pids/resque-resque-{{ item.0.name }}-{{ item.1.name }}.pid`" with_subelements: - resque_workers - workers when: "'{{ item.0.name }}:resque' in group_names and item.0.name == application" notify: ensure monit monitoring resque sudo: yes - name: ensure monit monitoring resque monit: name: "resque-{{ item.0.name }}-{{ item.1.name}}" state: monitored with_subelements: - resque_workers - workers when: "'{{ item.0.name }}:resque' in group_names and item.0.name == application" notify: reload monit sudo: yes ``` You can see here that we chain notification handlers to both restart the application and then ensure that our process monitor, [monit](http://mmonit.com/monit/), is configured to watch that application. ### post\_tasks.yml ```yml - name: deployment post tasks (all hosts) sudo: yes sudo_user: "{{ deploy_user }}" shell: > cd {{ release_path }} && RAILS_ENV={{ rails_env }} RACK_ENV={{ rails_env }} NODE_ENV={{ rails_env }} {{ item.cmd }} run_once: false when: > ('{{application}}' in group_names) and ({{ item.run_once | default(false) }} == false) and ({{ item.control | default(true) }} != false) with_items: "application_configs[application].post_deploy_tasks" - name: deployment post tasks (single hosts) sudo: yes sudo_user: "{{ deploy_user }}" shell: > cd {{ release_path }} && RAILS_ENV={{ rails_env }} RACK_ENV={{ rails_env }} NODE_ENV={{ rails_env }} {{ item.cmd }} run_once: true when: > ('{{application}}' in group_names) and ({{ item.run_once | default(false) }} == true) and ({{ item.control | default(true) }} != false) with_items: "application_configs[application].post_deploy_tasks" ``` post\_tasks are just like pre\_tasks, and allow you to run code after the servers have been restarted. Here is where you might clear caches, update CDNs, etc. ### email.yml Now the fun kicks in. Ansible makes it easy to keep adding more to your playbooks. We wanted to send the development team an email (and also notify HipChat in a similar role) every time a deploy goes out. 
Here’s a sample: ![](/images/medium-export/0__7DppKQ7yUl87qQJm.png) Here’s how to grab the variables you need: ```yml - name: "capture: sha" run_once: true register: deploy_email_git_sha shell: > cd {{ release_path }} && git rev-parse HEAD - name: "capture: deployer_email" run_once: true register: deploy_email_deployer_email shell: > cd {{ release_path }} && git log -1 --pretty="%ce" - name: "capture: branch" run_once: true register: deploy_email_branch shell: > cd {{ release_path }} && git rev-parse --abbrev-ref HEAD - name: "capture: commit message" run_once: true register: deploy_email_commit_message shell: > cd {{ release_path }} && git log -1 --pretty="%s" - set_fact: previous_revision='n/a' when: previous_revision is not defined - name: "capture: previous commits" run_once: true register: deploy_email_previous_commits when: deploy_previous_git_sha is defined and ( deploy_previous_git_sha.stdout_lines | length > 0 ) shell: > cd {{ release_path }} && git log {{ deploy_previous_git_sha.stdout_lines[0] }}..{{ deploy_email_git_sha.stdout_lines[0] }} --pretty=format:%h:%s --graph - name: "capture: human date" run_once: true register: deploy_email_human_date shell: date - name: build the deploy email body run_once: true local_action: template args: src: deploy_email.html.j2 dest: /tmp/deploy_email.html - name: send the deploy email run_once: true when: no_email is not defined or no_email == false local_action: shell sendmail {{ deploy_email_to }} < /tmp/deploy_email.html ``` and our email template is: ```text From: {{ deploy_email_deployer_email.stdout_lines[0] }} Subject: Deployment: {{ application }} [ {{ cluster_env }} ] Content-Type: text/html MIME-Version: 1.0

{{ application }} was deployed to {{ cluster_env }} by {{ deploy_email_deployer_email.stdout_lines[0] }} at {{ deploy_email_human_date.stdout_lines[0] }}

The {{ deploy_email_branch.stdout_lines[0] }} branch was deployed to {{ vars.play_hosts | count }} hosts

The latest commit is: {{ deploy_email_commit_message.stdout_lines[0] }}

Hosts:
    {% for host in vars.play_hosts %}
  • {{ host }}
  {% endfor %}
{% if deploy_email_previous_commits is defined and deploy_previous_git_sha.stdout_lines | length > 0 %} New on these servers since the last deploy:
{% for line in deploy_email_previous_commits.stdout_lines %} {{ line }}
{% endfor %} {% endif %} ``` And that’s how you build Capistrano within Ansible! You can see how simple it is to translate a complex tool into a few hundred lines of Ansible… with very clear responsibilities and separation. It’s also very easy to extend this to fit your workflow. --- --- url: /blog/post/2018-12-31-repeat-rate.md description: Second-Order Measurements and your Product --- We manage [Voom](https://www.voom.flights) like many modern consumer-facing digital companies: we use Agile, XP, Pair-Programming, and similar tools to focus and prioritize our work. To do any of those successfully, you need to be aligned on your goal… and you need to be able to measure your success towards it. ***Repeat Rate*** is a common metric that companies use to better understand their customers’ affinity towards their product. However, I’ve seen this metric defined in ways that incentivize unintended behaviors. ***Repeat Rate*** is different from other common ways to measure your business because, by definition, it is the rate of a rate *changing*, a second-order measurement, and just like in mathematics, there are a lot of formulas that look the same when you reduce them too far. Most key business indicators are first-order derivatives, i.e. "thing per time". *Customers Acquired per Week*, or *Conversions per Month* are common [KPIs](https://en.wikipedia.org/wiki/Performance_indicator). However, since ***Repeat Rate*** is a second-order derivative, we must do more to define and understand it. In this article, I’m going to share my belief that the term ***Repeat Rate*** is, at best, likely confusing your team, and at worst, probably the wrong way to measure your business. ![](/images/medium-export/1__XYGVsmFM9b0lnRDHJHfOyQ.jpeg) #### Sample Data Using Voom as an example, I want to propose 3 alternative ways we might measure ***Repeat Rate***. For each of these examples, we are going to use the same dataset, so that we can more closely see how your definition choice can dramatically skew your KPI. *For illustrative purposes, we can simplify things and assume Voom only sells 1 product (a Journey from Downtown to the Airport) and has a fixed price.* **January 2018:** ```text Customers Acquired (2 total) \* Evan \* Christina Journeys Purchased this month (10 total by 2 possible customers) \* Evan: 1 \* Christina: 9 ``` **February 2018:** ```text Customers Acquired (1 total) \* Megan Journeys Purchased this month (3 total by 3 possible customers) \* Christina: 2 \* Evan: 0 \* Megan: 1 ``` #### Repeat Rate Option 1: Cohort Performance over Time This is the "traditional" definition of ***Repeat Rate.*** In this world view, we are looking at the trailing 2-month period after a customer signs up, and checking if they made a repeat purchase or not. **January** * 2 new customers * 1 of the customers within the 2-month window made a repeat purchase in that period * (1 / 2) = **50%** **February** * 1 new customer * 0 of the new customers from within the 2-month window made a repeat purchase in that period (so far) * (0 / 1) = **0%** The overall trend is decreasing. #### **Repeat Rate Option 2: Repeating Conversions over Time** In this world view, we look at how many conversions happened each period, and then what percentage of them were the customer’s first purchase or a repeat purchase. **January** * 2 first-time purchases & 8 repeat purchases * 10 total purchases * (8 / 10) = **80%** **February** * 1 first-time purchase & 2 repeat purchases * 3 total purchases * (2 / 3) = **67%** The overall trend is decreasing. 
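To make the first two definitions concrete, here is a minimal sketch in TypeScript that computes Options 1 and 2 from the sample data above. The names and data structures are illustrative only (they are not from any real codebase), but the arithmetic matches the calculations shown.

```ts
// month -> customer -> number of journeys purchased that month (the sample data from above)
type MonthlyPurchases = Record<string, Record<string, number>>;

const acquiredIn: Record<string, string> = {
  Evan: "2018-01",
  Christina: "2018-01",
  Megan: "2018-02",
};

const purchases: MonthlyPurchases = {
  "2018-01": { Evan: 1, Christina: 9 },
  "2018-02": { Christina: 2, Evan: 0, Megan: 1 },
};

// Option 1: of the customers acquired in `month`, how many went on to purchase more than once?
function cohortRepeatRate(month: string): number {
  const cohort = Object.keys(acquiredIn).filter((c) => acquiredIn[c] === month);
  const repeaters = cohort.filter((c) => {
    const total = Object.values(purchases).reduce((sum, m) => sum + (m[c] ?? 0), 0);
    return total > 1;
  });
  return repeaters.length / cohort.length;
}

// Option 2: of the purchases made in `month`, what share were not the customer's first purchase?
function repeatConversionRate(month: string): number {
  let firstTime = 0;
  let repeat = 0;
  for (const [customer, count] of Object.entries(purchases[month])) {
    if (count === 0) continue;
    const first = acquiredIn[customer] === month ? 1 : 0; // a newly acquired customer's first purchase lands here
    firstTime += first;
    repeat += count - first;
  }
  return repeat / (firstTime + repeat);
}

console.log(cohortRepeatRate("2018-01"), cohortRepeatRate("2018-02")); // 0.5, 0
console.log(repeatConversionRate("2018-01"), repeatConversionRate("2018-02")); // 0.8, ~0.67
```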
#### Repeat Rate Option 3: Repeating Customers over Time In this world view, we look at how many unique customers made a purchase in each period, and then what percentage of them were making a first or repeat purchase. In the event where a customer both made their first and subsequent purchases in the same period, we will count them as *both* a new and repeating customer.\* **January** * 2 first-time purchasers and 1 repeating purchaser * 3 total customers making a purchase\* * (1 / 3) = **33%** **February** * 1 first-time purchaser and 1 repeating purchaser * 2 total customers making a purchase\* * (1 / 2) = **50%** The overall trend is increasing. #### Analysis ![](/images/medium-export/1__q63k__ikuN8__6ob1hrVc3Yw.png) As you can see, using the same data, we’ve got 3 distinct ways to calculate ***Repeat Rate***, each of which produces different results! What’s more troubling is that 2 of our methods produce decreasing trends, while one produces an increasing trend. But which one is right for you? **Repeat Rate Option 1: Cohort Performance over Time** *(aka "Repeat Rate")***:** This would be a measure of the effectiveness of your ability to acquire the proper type of customer who is immediately ready to purchase your product again. **Repeat Rate Option 2: Repeating Conversions over Time:** This would be a measure of the total sales of your business, and how many of them go to repeat customers vs. new customers. **Repeat Rate Option 3: Repeating Customers over Time:** This would be a measure of your customers, and how many of them go on to make that "n+" conversion. Unless you are explicitly measuring a change in your Marketing/PR/Top-of-funnel approach, I think that option one, the traditional way of calculating ***Repeat Rate***, is likely not what you want. Of course, whether you choose to measure ***Repeating Customers per Month*** or ***Repeating Conversions per Month*** is up to you… but don’t just call it Repeat Rate! --- --- url: /blog/post/2013-10-28-ruby-osx-mavericks.md description: >- Are you developing in Ruby and have just upgraded your OSX Machine to Mavericks (OSX 10.9)? Are you suddenly having trouble installing or… --- Are you developing in Ruby and have just upgraded your OSX Machine to Mavericks (OSX 10.9)? Are you suddenly having trouble installing or compiling gems? I’ve seen a few articles on the topic, but none of them really worked for me. Here’s what did: * ensure you have the newest version of Xcode installed from the App Store (it’s still free) * force-update the developer tools \`xcode-select --install\` * re-install the version of ruby you are using * If you are using [rbenv](https://github.com/sstephenson/rbenv), it’s as simple as \`rbenv install\` That’s it! No need for messing with your symlinks or system libs. --- --- url: /blog/post/2016-09-22-ruby-homebrew-and-osx-sierra.md description: Say you are a developer and you recently updated to OSX Sierra… --- Say you are a developer and you recently updated to OSX Sierra… ![](/images/medium-export/1__nG5K39SYOMiqX9SHwn__3lg.jpeg) You may encounter an error like this when doing anything with ruby, even booting up *IRB*: ```ruby undefined method `default_gems_use_full_paths?' ``` If you are a [HomeBrew](http://brew.sh/) user, this will cause everything to break :( After much googling, this error is the result of you having installed a newer version of ruby-gems (or a patch) against your ***system*** version of ruby. There is an easy fix! 
Remove any installs of ruby-gems or any ruby-gems patches against the system version of ruby; i.e. remove/rename everything in `/Library/Ruby/Gems/2.0.0` and try again. You shouldn’t have any other gems in there (other than perhaps a runner like *rake*), so you shouldn’t have any gems to re-install. You *should* be using a ruby version manager like [rbenv](https://github.com/rbenv/rbenv) to manage all of your gems anyway :D --- --- url: /blog/post/2021-08-16-node-js-and-ipv6.md description: Grouparoo can speak IPv4 and IPv6 - Here's how we did it. --- We want to make Grouparoo as easy as possible to run, which means considering many different server environments. We recently had a customer who wanted to run Grouparoo in a Docker cluster that only had IPv6 addresses enabled. There are lots of reasons why IPv6 might be better (including the fact that we are [running out of public IPv4 Addresses](https://en.wikipedia.org/wiki/IPv4_address_exhaustion)), but it’s rare to find a deployment environment that *only* has IPv6 addresses by default. That said, it’s easy to tell your Node.js application to listen to all hosts on both IPv4 and IPv6 - and that's what Grouparoo does [now](https://github.com/grouparoo/grouparoo/pull/2127)! ![Twisty Roads](/images/posts/2021-08-16-node-js-and-ipv6/sylvain-gllm-X4dBqRUzO2U-unsplash.jpeg) ## The Node.js HTTP Server When starting a Node.js server, you can choose both the port to listen on and the hostname to bind to ([docs](https://nodejs.org/api/net.html#net_class_net_server)). Depending on how your server is configured, choosing a specific hostname might route traffic in different ways. Maybe you have 2 network cards (one for internal traffic and one for external), or perhaps you have different networks for IPv4 and IPv6 traffic - choosing a certain hostname may have different effects. The node HTTP example looks like this: ```ts // from https://nodejs.org/en/docs/guides/getting-started-guide/ const http = require("http"); const hostname = "127.0.0.1"; const port = 3000; const server = http.createServer((req, res) => { res.statusCode = 200; res.setHeader("Content-Type", "text/plain"); res.end("Hello World"); }); server.listen(port, hostname, () => { console.log(`Server running at http://${hostname}:${port}/`); }); ``` In this example, we are binding only to `127.0.0.1` which is the IPv4 version of what is called a `loopback` - this means that only the same computer can talk to itself. This is a very safe way to test and develop, and a very bad way to run a web server 🤣. Conversely, what if we wanted to allow traffic in from the widest variety of hosts? This is the default configuration of Grouparoo - we want the application to be as widely available as possible, and for the infrastructure to be in charge of restricting who the application can talk to. In IPv4, that would mean choosing a host of `0.0.0.0` which would allow traffic from anywhere. What about IPv6? It turns out that the string `::` (yes, that’s two colons) means "everywhere" in IPv6, and is shorthand for `0:0:0:0:0:0:0:0`. So, in our node.js example above, that would mean that the most permissive host options would be: ```ts const hostname = "::"; const port = 3000; //... server.listen(port, hostname); ``` ## Testing How can we test that both IPv4 and IPv6 clients can reach your application? Grouparoo exposes a public "status" endpoint we can use to make sure that the application is reachable, and we can try to connect via `cURL` over a few hostnames & IP Addresses. 
Then, we pipe the response through the [`jq`](https://stedolan.github.io/jq/) command to parse out just the "status" response key: #### IPv4 testing: ```bash $ curl -s "http://localhost:3000/api/v1/status/public" | jq .status "ok" $ curl -s "http://127.0.0.1:3000/api/v1/status/public" | jq .status "ok" $ curl -s "http://0.0.0.0:3000/api/v1/status/public" | jq .status "ok" ``` #### IPv6 testing: ```bash $ curl -s "http://[::1]:3000/api/v1/status/public" | jq .status "ok" $ curl -s "http://[::]:3000/api/v1/status/public" | jq .status "ok" $ curl -s "http://[::ffff:127.0.0.1]:3000/api/v1/status/public" | jq .status "ok" $ curl -s "http://[0:0:0:0:0:0:0:1]:3000/api/v1/status/public" | jq .status "ok" $ curl -s "http://[0:0:0:0:0:0:0:0]:3000/api/v1/status/public" | jq .status "ok" ``` You can see that for both IPv4 connections (e.g. `127.0.0.1`) and IPv6 connections (e.g. `0:0:0:0:0:0:0:1`) we can connect to our app! ## IPv6 Also Supports IPv4 Addresses The hostname of `::` works with IPv4 addresses because it is backwards compatible. Technically, we have only bound to an IPv6 address, but IPv6 can still handle the older style of connections. This is visible in Grouparoo's logs when we try `127.0.0.1`: ``` 2021-08-05T18:03:53.046Z - info: [ action @ web ] to=::ffff:127.0.0.1 action=status:public params={"action":"status:public","apiVersion":"1"} duration=3 method=GET pathname=/api/v1/status/public ``` The address `127.0.0.1` is translated to `::ffff:127.0.0.1`, which is the IPv6 interpretation of `127.0.0.1` - the leading IPv6 sections are zeroed out, and the `ffff` group marks it as an IPv4-mapped address. *** Thanks to [Stack Overflow](https://stackoverflow.com/questions/40189084/what-is-ipv6-for-localhost-and-0-0-0-0) for some of this information. --- --- url: /blog/post/2012-01-14-running-node-on-a-phidget-board.md description: >- My last post was about creating a generic npm package to talk to a phidget board. This post is about compiling nodeJS to run on a phidget board --- ![](/images/medium-export/1__QYAVA14BeJgLpD7ajTlCWQ.png) My last post was about creating a generic npm package to talk to a phidget board. This post is about compiling nodeJS to run -on- a phidget board. There is a special class of Phidget boards which are actually small ARM microcomputers. Rather than the "normal" method of connecting to phidgets via USB, these boards embrace the network and offer local storage, network I/O and USB ports. They run a slimmed down version of Debian called EmDebian. I’m currently obsessed with getting Node to run on this little computer. Having a first-class web programming stack on an embedded device with sensors is my dream prototyping environment. There are a lot of hardware limitations to doing so (not the least of which is only having 50MB of RAM), but I am getting close! I’ve entered the dark world of cross-compiling, and I would love any help you might be able to offer. [See the Github Gist Here](https://gist.github.com/evantahler/1574158) --- --- url: /blog/post/2012-02-17-sf-from-a-yinzers-pov.md description: >- I have a hard time answering the question "where are you from?". I’ve lived in over 15 different homes in 8 cities. I was born in New… --- ![](/images/medium-export/1__ocGA4k76Nn9lgWEESMShsQ.jpeg) I have a hard time answering the question "where are you from?". I’ve lived in over 15 different homes in 8 cities. I was born in New Jersey, went to high school in Pennsylvania, and currently live in California. I have decided that for me, the question "where are you from" is equivalent to "where have you lived the longest?". 
For me, that’s Pittsburgh PA. There’s more to the question "Where are you from?" than duration of occupation. It implies that the place you are from has had some meaningful impact on your way of thinking and (potentially) upbringing. I’ve noticed that Pittsburgh also fits the bill for that with me as well (and it’s deeper than loving pierogi). Without getting too philosophical here, I’ve noticed that I tend to use Pittsburgh as my "measuring stick" when comparing cities. I do this so much that I have started to think of San Francisco neighborhoods as their Pittsburgh equivalents. It turns out that there are a lot of people in San Francisco that have passed through Pittsburgh, and liked this notion. I am proud to present to you version 1 of ### ["San Francisco neighborhoods as Pittsburgh neighborhoods](http://bit.ly/y5YUwj)" ### or ### ["How a Yinzer sees San Francisco](http://bit.ly/y5YUwj)" [Kate](http://katealamode.com), [Andy](https://twitter.com/#!/andyjih), and [Edmundo](http://www.edmundito.com/) helped me out with this one. [Yinzer is Pittsburghese](https://en.wiktionary.org/wiki/yinzer) for "Person from Pittsburgh" --- --- url: /blog/post/2020-09-14-save-your-high-water-marks-as-strings.md description: >- Asking the database to return the High Water Mark as a string prevents a number of bugs. --- In Brian’s post, [Building a Sync Engine](/blog/building-a-sync-engine), he talks about the value of using a **High Water Mark** to keep track of the latest bit of data you’ve imported. This approach is often a better pattern than using `Limit` and `Offset`, especially when the underlying data might be changing. In this post, I’m going to dive even deeper into this topic, and suggest that you should be storing your High Water Marks as strings whenever possible. ## The Hidden Problem Consider the following query: ```sql SELECT * FROM USERS WHERE UPDATED_AT >= '2020-08-27 12:00:00' ORDER BY updated_at ASC LIMIT 10 ``` Here, we are asking for the next 10 users who have been updated since noon on August 27th. This query is a good implementation of using a High Water Mark to remember the `updated_at` timestamp of the last User we saw and to get the next batch. In this example, the previous value of our High Water Mark was `2020-08-27 12:00:00`. There are a number of scenarios in which `2020-08-27 12:00:00` might actually not be the correct *string representation* of the High Water Mark. The types of bugs to watch out for fall into 2 main categories: `stringification` and `resolution`. ### Stringification Bugs The `stringification` camp of bugs has to do with converting a "date" or "time" object into a string. We are required to use strings when writing SQL queries, so at some point, either you or your [ORM](https://en.wikipedia.org/wiki/Object-relational_mapping) will need to convert an Object to a String. In just Javascript there are many ways to do this: `new Date().toString()`, `new Date().getTime().toString()`, `new Date().toISOString()`, etc - all of which will generate different strings. More insidiously, there are other issues hidden in the `stringification` category - those revolving around Timezones and clock drift. When your code builds the `Date` object from the response from your database, which timezone will it be using - the Timezone of your Database or the Timezone of your Application? Do you know if the database is returning timestamps in *its* timezone or a more global representation of time (i.e. `Timestamp with Timezone` in Postgres)? 
Are the results the same in Staging vs Production... and does your ORM know the difference? ### Resolution Bugs The `resolution` class of problems is less dangerous than the `stringification` problems, but it can result in duplicated reads and therefore slower imports. Consider these rows in Postgres: ![Getting your API Key and Secret](/images/posts/2020-09-14-save-your-high-water-marks-as-strings/database.png) We’ve got values of `2020-07-25 12:18:56.831` for `updated_at` - that’s precision down to fractions of a second! However, that data is lost when the [`pg`](https://node-postgres.com/) package reads that row and casts it to a `new Date()`. When we eventually build a string out of it to make our next query, we only get `2020-07-25 12:18:56` back. If you follow the advice in our previous post to always compare with equality (`>=`) you won’t skip any rows, but you’ll read the same row back again each time. ## The Solution So what’s the solution here? Knowing that we will need to convert our High Water Mark to and from a string type, **we should ask the database to do the string conversion for us**. This approach is called "casting" - converting data from one type format to another, and the Database is the best place to do it. Casting the High Water Mark to a string at the database ensures: * The string representation of the High Water Mark is in a format the database can accept. * The string representation of the High Water Mark is in the timezone the database is already using/assuming. * The string representation of the High Water Mark is represented with the maximum accuracy the database can use. This turns our example query into the following: ```sql -- Postgres SELECT *, updated_at::text as __hwm FROM USERS WHERE UPDATED_AT >= '2020-08-27 12:00:00' ORDER BY updated_at ASC LIMIT 10 -- MySQL SELECT *, CAST(updated_at as CHAR) as __hwm FROM USERS WHERE UPDATED_AT >= '2020-08-27 12:00:00' ORDER BY updated_at ASC LIMIT 10 ``` We ask the database both for all the data about the rows we are selecting, and we ask for `updated_at` to be *cast* as a string for us, returned as `__hwm`. We can now use `__hwm` directly in subsequent queries without any of the problems listed above. --- --- url: /blog/post/2017-03-23-scoreboard-guru-initial-release.md description: I’m happy to announce the initial release of Scoreboard Guru! --- I’m happy to announce the initial release of [**Scoreboard Guru**](https://www.scoreboard.guru)! ![](/images/medium-export/1__W__ZjnVPX889X__uzE9GXMmg.png) ### What is Scoreboard Guru? > Scoreboard Guru is the best way to keep track of your scores in all your games & sports! > You can see your performance over time, compare with your friends, and export your scores. Scoreboard Guru is easy to use, and backs up all of your data on our servers so all your history is safe & available on all your devices. I’m a nerd, and I play a lot of board games. I’ve always wondered how I compared to the rest of my friends and family. I built Scoreboard Guru to not only keep track of my scores over time, but also to allow friends to compare their performance with each other and see how they stack up. ### Where did Scoreboard Guru come from? There are a *number* of score-keeping apps out there, but none of them had all the features I wanted. Scoreboard Guru’s unique features include: * Cloud backup & sync so you can view your scores across multiple devices, and, if a friend is tracking this game, your scores will still be counted. * Geo-tagging, location-tagging, and image upload. 
You can remember the name of that game you played at Sally’s house a few months back! * Export & Sharing. While all your scores are stored on our servers, you can always share your matches online, and export your data as a CSV. * Offline works too! There’s nothing worse than not being able to use an app just because your cell connection stops working. With Scoreboard Guru, you can store (many) in-progress games on your phone, and then upload them when you get back online. ### How can I get Scoreboard Guru? I wanted to make Scoreboard Guru simple to use, and accessible. That’s why I’m happy to report that Scoreboard Guru is free to try. Just download the app and sign up, and you will be able to create up to 3 games and matches. After that, it’s only $2 to unlock the app… on all your devices! Head on over to [www.scoreboard.guru](http://www.scoreboard.guru) (yes, .guru is a real top-level domain name), and give it a shot! --- --- url: /blog/post/2020-07-23-nextjs-plugins.md description: >- How does Grouparoo use Next.js to load pages and components from plugins to modify our web user interface? --- ![computer and fern](/images/posts/2020-07-23-nextjs-plugins/computer-and-fern.jpeg) At Grouparoo, our front-end website is built using [React](https://reactjs.org/) and [Next.js.](https://nextjs.org/) Next.js is an excellent tool made by [Vercel](https://vercel.com/) that handles all the hard parts of making a React app for you - Routing, Server-side Rendering, Page Hydration and more. It includes a simple starting place to build your routes and pages, based on the file system. If you want a `/about` page, just make a `/pages/about.tsx` file! The Grouparoo ecosystem contains many ways to extend the main Grouparoo application through plugins. Part of what Grouparoo plugins can do is add new pages to the UI, or add new components to existing pages. We use Next.js to build our front-end... which is very opinionated in its default settings to only work with "local" files and pages. How then can we use Next.js to load pages and components from other locations like plugins? In this post, we’ll talk about how to load additional components and pages from a sub-project, like a [lerna](https://github.com/lerna/lerna) monorepo, or a package released to NPM. ::: tip To see the project described in this blog post, please visit the [github.com/grouparoo/next-plugins-example](https://github.com/grouparoo/next-plugins-example) repository. ::: ## Setting up the Project We have a monorepo, which we will be using Lerna to manage. We have a `server` project which is our main application and `plugins` which contain plugins the `server` can use. The plugin, `my-nextjs-plugin`, contains a page, `/pages/hello.tsx`, which we want the main application to display. [See the repository here](https://github.com/grouparoo/next-plugins-example). ![A screenshot of the Github Repo](/images/posts/2020-07-23-nextjs-plugins/repo-screenshot.png) Our `lerna.json` looks like this: ```json // lerna.json { "packages": ["plugins/*", "server"], "version": "0.0.1" } ``` Our top-level `package.json` contains only `lerna` and some scripts that allow us to run `lerna bootstrap` as part of the top-level install process and helpers to run `dev` and `start` for us in the main `server` project. 
```json // package.json { "name": "next-plugins", "version": "0.0.1", "description": "An example of how to use a dynamic import to load a page from a random plugin outside of the main next \"pages\" directory", "private": true, "dependencies": { "lerna": "^3.22.1" }, "scripts": { "start": "cd server && npm run start", "dev": "cd server && npm run dev", "test": "cd server && npm run build", "prepare": "lerna bootstrap --strict" } } ``` This configuration means that when you type `npm install` at the top-level of this project, the following will happen: 1. Lerna will be installed 2. `lerna bootstrap` will be run, which in turn: 1. Runs `npm install` in each child project (`server` and `plugins`) 2. Ensures that we symlink local versions of the `plugins` into the `server` project. 3. Runs the `npm prepare` lifecycle hooks for each sub-project, which means we can `next build` automatically as part of the install process. Our `package.json` file for the server can look like: ```json // server/package.json { "name": "next-plugins-server", "version": "0.0.1", "description": "I am the server!", "license": "ISC", "private": true, "dependencies": { "my-nextjs-plugin": "0.0.1", "next": "^9.3.2", "react": "^16.13.1", "react-dom": "^16.13.1", "fs-extra": "^9.0.1", "glob": "^7.1.6" }, "scripts": { "dev": "next", "build": "next build", "start": "next start", "prepare": "npm run build" }, "devDependencies": { "@types/node": "^13.7.1", "@types/react": "^16.9.19", "typescript": "^3.7.5" } } ``` And the `package.json` from the plugin can look like: ```json // plugins/my-nextjs-plugin/package.json { "name": "my-nextjs-plugin", "version": "0.0.1", "description": "I am the plugin!", "main": "index.js", "private": true, "license": "ISC", "dependencies": { "react": "^16.13.1", "react-dom": "^16.13.1" } } ``` Now that the applications are set up, we can add some pages into the `server/pages` directory and confirm that everything is working by running `npm run dev`. ## Dynamic pages in Next.js Next.js has a cool feature that allows you to use files named `[my-variable].tsx` to indicate a wildcard page route. You can then get the value of `my-variable` in your React components. This feature allows us to make a page that handles all the routes we might want to use for our plugins, in this case `pages/plugins/[plugin]/[page].tsx`. The page itself doesn’t do much except for handle the routing, which you can see here: ```tsx // server/pages/plugins/[plugin]/[page].tsx import dynamic from "next/dynamic"; import { useRouter } from "next/router"; import Link from "next/link"; export default function PluginContainerPage() { const router = useRouter(); // The Next router might not be ready yet... if (!router?.query?.plugin) return null; if (!router?.query?.page) return null; // dynamically load the component const PluginComponent = dynamic( () => import( `./../../../../plugins/${router.query.plugin}/pages/${router.query.page}` ), { loading: () =>

<div>Loading...</div>, }, ); return ( <> <Link href="/"><a>Back</a></Link> {/* render the dynamically imported plugin page */} <PluginComponent /> </>
); } ``` This configuration is how our `hello` page from the plugin could be loaded by the route `/plugins/my-nextjs-plugin/hello` in the `server` application! ## Hacking the Next.js Webpack configuration Our next step is to extend the Webpack configuration that Next.js provides and use it in our plugins. Next.js comes with all the required tools and configuration for Webpack and Babel to transpile Typescript and TSX (and JSX) pages on the fly... but our plugin doesn’t have access to that because by default, Next.js only includes files within *this* project for compilation. In `next.config.js` we can extend the Webpack configuration that ships with Next.js to include our plugin: ```js // server/next.config.js module.exports = { webpack: (config, options) => { config.module.rules.push({ test: /plugins\/.*\.ts?|plugins\/.*\.tsx?/, use: [options.defaultLoaders.babel], }); return config; }, }; ``` Without this extra Webpack rule, you’ll see compilation or parse errors as the plugins’ TSX/JSX will not be compiled into browser-usable javascript. ## Webpack Loading Shims The final piece of the puzzle is to give Webpack some help to know where to look for our plugin files. In our `pages/plugins/[plugin]/[page].tsx`, we gave Webpack a pretty big area of the filesystem to search with the `import(./../../../../plugins/${router.query.plugin}/pages/${router.query.page})` directive. Under the hood, Webpack is looking for all possible files which might match this pattern, in any combination. This search pattern includes cases when one of those paths might be `..`, which may end up scanning a large swath of your filesystem. This approach can be very slow if you have a big project, and can lead to out-of-memory errors. Even without crashing, it will make your plugin pages slow to load. To fix these issues, rather than using wildcards, we can statically reference only the files we’ll need by building “shim” loaders as part of our boot process. We can add `require('./plugins.js')` to `next.config.js` to make sure that this process happens at boot. What `plugins.js` does is loop through all the pages in our plugins and create a shim in `tmp/plugin` for every file we might want to import. ```js // server/plugins.js const fs = require("fs-extra"); const path = require("path"); const glob = require("glob"); // prepare the paths we'll be using and start clean if (fs.existsSync(path.join(__dirname, "tmp"))) { fs.rmdirSync(path.join(__dirname, "tmp"), { recursive: true }); } fs.mkdirpSync(path.join(__dirname, "tmp")); // the top-level folder needs to exist for webpack to scan, even if there are no plugins fs.mkdirpSync(path.join(__dirname, "tmp", "plugin")); // For every plugin provided, we need to make a file within the core project that has a direct import for it. // We do not want to use wildcard strings in the import statement to save webpack from scanning all of our directories. 
const plugins = glob.sync(path.join(__dirname, "..", "plugins", "*")); plugins.map((plugin) => { const pluginName = plugin .replace(path.join(__dirname, "..", "plugins"), "") .replace(/\//g, ""); fs.mkdirpSync(path.join(__dirname, "tmp", "plugin", pluginName)); const pluginPages = glob.sync(path.join(plugin, "pages", "*")); pluginPages.map((page) => { const pageName = page .replace(path.join(__dirname, "..", "plugins", pluginName, "pages"), "") .replace(/\//g, ""); fs.writeFileSync( path.join(__dirname, "tmp", "plugin", pluginName, `${pageName}`), `export { default } from "${page.replace(/\.tsx$/, "")}" console.info("[Plugin] '${pageName}' from ${pluginName}");`, ); }); }); ``` For example, the shim for `hello.tsx` in our plugin looks like: ```tsx // generated into server/tmp/plugin/my-nextjs-plugin/hello.tsx export { default } from "/Users/evan/workspace/next-plugins/plugins/my-nextjs-plugin/pages/hello"; console.info("[Plugin] 'hello.tsx' from my-nextjs-plugin"); ``` This shim does a few things for us: 1. Since this shim is now within the main `server` project, Next.js and Webpack will pre-compile and watch this file for us 2. We can change our dynamic import statement in `pages/plugins/[plugin]/[page].tsx` to reference our shim rather than the file outside of the project. This keeps Webpack much faster. The updated version of `pages/plugins/[plugin]/[page].tsx` is now: ```tsx // server/pages/plugins/[plugin]/[page].tsx import dynamic from "next/dynamic"; import { useRouter } from "next/router"; import Link from "next/link"; export default function PluginContainerPage() { const router = useRouter(); // The Next router might not be ready yet... if (!router?.query?.plugin) return null; if (!router?.query?.page) return null; // dynamically load the component const PluginComponent = dynamic( () => import( `./../../../tmp/plugin/${router.query.plugin}/${router.query.page}` ), { loading: () =>

<div>Loading...</div>, }, ); return ( <> <Link href="/"><a>Back</a></Link> {/* render the dynamically imported plugin page */} <PluginComponent /> </>
); } ``` And you’ll get a nice note in the console too! ![The plugin loads and shows a note](/images/posts/2020-07-23-nextjs-plugins/console-note.png) ## Packages released via NPM You can now include React pages and components from plugins into your Next.js application. The methods outlined here will work for both Next’s development mode (`next dev`), and compiled “production” mode (`next build && next start`). These techniques will also work for packages you install from NPM, but you’ll need to adjust some of the paths when building your shims. Assuming your NPM packages only contain your not-yet-compiled code (TSX, TS, or JSX files), we will need to make one final adjustment. By default, the Next.js Webpack plugin does not compile files found within `node_modules`, so we’ll need to override that behavior too. That makes our final `next.config.js`: ```js // server/next.config.js const glob = require("glob"); const path = require("path"); const pluginNames = glob .sync(path.join(__dirname, "..", "plugins", "*")) .map((plugin) => plugin.replace(path.join(__dirname, "..", "plugins"), "")) .map((plugin) => plugin.replace(/\//g, "")); require("./plugins"); // prepare plugins module.exports = { webpack: (config, options) => { // allow compilation of our plugins when we load them from NPM const rule = config.module.rules[0]; const originalExcludeMethod = rule.exclude; config.module.rules[0].exclude = (moduleName, ...otherArgs) => { // we want to explicitly allow our plugins for (const i in pluginNames) { if (moduleName.indexOf(`node_modules/${pluginNames[i]}`) >= 0) { return false; } } // otherwise, use the original rule return originalExcludeMethod(moduleName, ...otherArgs); }; // add a rule to compile our plugins from within the monorepo config.module.rules.push({ test: /plugins\/.*\.ts?|plugins\/.*\.tsx?/, use: [options.defaultLoaders.babel], }); // we want to ensure that the server project's version of react is used in all cases config.resolve.alias["react"] = path.join( __dirname, "node_modules", "react", ); config.resolve.alias["react-dom"] = path.resolve( __dirname, "node_modules", "react-dom", ); return config; }, }; ``` Note that we’ve also added a `config.resolve.alias` section telling Webpack that any time it sees `react` or `react-dom`, we should always use the version from `server`’s package.json. This alias will help you to avoid problems with multiple versions or instances of React. --- --- url: /blog/post/2012-02-16-small-update-to-node-phidgets.md description: >- After all the hoopla about node.js and Nerf guns yesterday I revisited the phidgets package and fixed a few bugs. Most importantly there… --- After all the hoopla about node.js and Nerf guns yesterday I revisited the phidgets package and fixed a few bugs. Most importantly there should no longer be any ‘skipped’ data lines like there were before. As before, you can either npm install phidgets or get the code [here](https://github.com/evantahler/nodePhidgets) [**evantahler/nodePhidgets**](https://github.com/evantahler/nodePhidgets) --- --- url: /blog/post/2021-03-04-sql-dialect-differences.md description: >- Grouparoo works with both SQLite and Postgres databases. This post shares what we've learned about the differences. --- Like many applications, Grouparoo stores data in a relational database. Unlike most applications, Grouparoo works with 2 different types of databases - Postgres and SQLite. 
We enable our customers to run Grouparoo in a number of different ways - on their laptop with no external dependencies, and as part of a large cluster with many servers processing data in parallel. When running Grouparoo locally, you can use SQLite so no other dependencies are needed, and in the production cluster, you can use a hosted version of Postgres provided by your hosting provider. ![Grouparoo likes SQLite and Postgres](/images/posts/2021-03-04-sql-dialect-differences/210303-databases.png) Grouparoo uses the [Sequelize](https://sequelize.org/) Object Relational Mapper, or `ORM`, along with [sequelize-typescript](https://github.com/RobinBuschmann/sequelize-typescript) so we can work with the same Objects in our codebase, regardless of the database providing persistence. Sequelize does a great job of abstracting away the differences between the database types... most of the time. In this blog post, I’ll be sharing the times when the differences in the SQL implementations of Postgres and SQLite matter. ## Case Insensitive String Comparisons Postgres supports both the `like` and `iLike` operators for comparing strings, with the `i` indicating case-insensitive matching ([Postgres Docs](https://www.postgresql.org/docs/12/functions-matching.html)). That means you can choose, per query, if you are ignoring case or not: ```sql -- Postgres -- -- assuming you have `email = person@example.com` (lowercase) in your `users` table -- match SELECT * FROM users WHERE email ILIKE '%@EXAMPLE.COM'; -- no match SELECT * FROM users WHERE email LIKE '%@EXAMPLE.COM'; ``` However, in SQLite, `like` comparisons are always case-insensitive (and there is no `iLike` function ([SQLite Docs](https://sqlite.org/lang_expr.html))). Instead, if you really want your `like` function to be made case-sensitive, you would use the `case_sensitive_like` PRAGMA ([SQLite Docs](https://sqlite.org/pragma.html#pragma_case_sensitive_like))... but that’s a database-wide change that you likely don’t want to use. ```sql -- SQLite -- -- assuming you have `email = person@example.com` (lowercase) in your `users` table -- match SELECT * FROM users WHERE email LIKE '%@EXAMPLE.COM'; -- no match PRAGMA case_sensitive_like=ON; SELECT * FROM users WHERE email LIKE '%@EXAMPLE.COM'; ``` In the Grouparoo application, this distinction shows up in a number of places, with the most interesting being that we need to provide different rules that can be used to calculate Group membership. If you visit [the groups config page](/docs/support/config-files#groups) and check out the options for string or email comparisons between Postgres and SQLite, you’ll see the difference. ## Date and Time Part Functions Postgres ships with a number of handy date and time functions with a consistent API, like `date_trunc`. ([Postgres Docs](https://www.postgresql.org/docs/9.1/functions-datetime.html)) SQLite instead chose to rely on the C-like `strftime` function ([SQLite Docs](https://sqlite.org/lang_datefunc.html)). Both are popular ways to deal with time, but very different approaches. 
For example, if we want to count up how many events occurred per hour: ```SQL -- Postgres --- SELECT COUNT(*) as total, date_trunc('hour', "occurredAt") as time FROM events GROUP BY 2 -- SQLite --- SELECT COUNT(*) as total, strftime('%Y-%m-%d %H:00:00', "occurredAt") as time FROM events GROUP BY 2 ``` While not necessarily a user-facing problem, there are quite a few places in the Grouparoo codebase where we calculate rollups like these, and need to make different queries depending on the database in use. ## Min and Max typecasting Sequelize helps you to write rather complex queries in a database-agnostic way. Consider the following query that asks for all the types of events that exist, and returns the count, first occurrence and most recent occurrence. e.g.: we might learn that there have been 100 `pageview` events, with the first one on Jan 1 and the most recent one today. This Sequelize query works for both Postgres and SQLite! ```js const types = await Event.findAll({ attributes: [ "type", [api.sequelize.fn("COUNT", "id"), "count"], [api.sequelize.fn("MIN", api.sequelize.col("occurredAt")), "min"], [api.sequelize.fn("MAX", api.sequelize.col("occurredAt")), "max"], ], group: ["type"], order: [[api.sequelize.literal("count"), "desc"]], }); ``` However, the resulting objects differ slightly: `types[0].min` will be a JS `Date` object from Postgres and a `string` from SQLite. They will need to be converted to the same type in your application code. ## Boolean Column typecasting [SQLite does not have Boolean columns](https://www.sqlite.org/datatype3.html), and uses integers instead. When using an ORM that supports the boolean type, *most* of the time it knows to convert the database’s `1` to `true` and `0` to `false`, but when accessing properties directly it may not. This appears regularly with Sequelize’s `instance.getDataValue()` method. Conversely, Postgres boolean values are always properly cast. ## Transaction Limits SQLite can only handle one transaction at a time. This makes sense, as it’s quite literally reading and writing a file on disk. Postgres, on the other hand, can handle many transactions at once and does a great job of merging the results and avoiding deadlocks. If you are using Node.js like Grouparoo is, even a single process can generate many transactions - you might be processing many API requests in parallel, or in the case of Grouparoo, running many background tasks at once. To help avoid SQLite deadlocks (which look like `SequelizeTimeoutError: SQLITE_BUSY: database is locked`), we limit the number of workers we run against a SQLite database to 1. ## Compound Indexes with Unique Columns Sequelize has a [bug](https://github.com/sequelize/sequelize/issues/12823) in which a migration against a table that has an index against 2 columns will make those columns unique, even if they weren’t unique before the migration. To mitigate this, we do not use compound indexes in the Grouparoo application. *** While this list may seem long, the vast majority of the Grouparoo codebase works exactly the same regardless of whether you are backing the application with SQLite or Postgres. The Sequelize team did a great job abstracting most of the dialect nuances away. 
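To illustrate the `MIN`/`MAX` and boolean typecasting differences above in one place, here is a minimal sketch (TypeScript) of how application code might normalize the values coming back from either dialect. The row shape and the hypothetical `isImportant` column are illustrative only - they are not part of the Grouparoo codebase.

```ts
// A hypothetical row shape, roughly matching the MIN/MAX query above.
type EventTypeRow = {
  type: string;
  count: number;
  min: Date | string; // a JS Date from Postgres, a string from SQLite
  max: Date | string;
  isImportant?: boolean | number; // a boolean from Postgres, 0/1 from SQLite
};

// Normalize the dialect differences in one place so the rest of the app sees consistent types.
function normalizeRow(row: EventTypeRow) {
  return {
    ...row,
    min: row.min instanceof Date ? row.min : new Date(row.min),
    max: row.max instanceof Date ? row.max : new Date(row.max),
    isImportant:
      row.isImportant === undefined ? undefined : Boolean(row.isImportant),
  };
}
```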
--- --- url: /blog/post/2014-08-27-statusbot.md description: Keep track of your uptime with Statuspage.io and Node.js --- With the [New TaskRabbit](http://blog.taskrabbit.com/2014/07/10/the-new-taskrabbit-is-here-with-new-ios-android-apps-for-clients-and-1m-insurance-policy-on-every-task/), we rolled out [status.taskrabbit.com](http://status.taskrabbit.com) to help notify our users & partners when something goes wrong with [www.taskrabbit.com](https://www.taskrabbit.com) We selected [statuspage.io](http://statuspage.io) to power this status page because they have a simple API, and they handle logs, subscriptions, and notifications for us. ![](/images/medium-export/0__DrCJJqCWSpAOEpNu.png) To send data to statuspage.io, we built [**StatusBot** (github)](https://github.com/taskrabbit/statusbot) to poll our public pages and APIs, and then pass along that information to the public status page. [**taskrabbit/statusbot** \_statusbot - Monitor your uptime automagically with statuspage.io\_github.com](https://github.com/taskrabbit/statusbot "https://github.com/taskrabbit/statusbot")[](https://github.com/taskrabbit/statusbot) We already had a complex health check which we use internally to check on the status of the app/marketplace, so there wasn’t much extra development within our core ruby/node apps to support this. The change we implemented was to split our health check up into separate endpoints, each of which tested a single subsystem. For example, we have one unique health check to monitor our Resque queue length, and another one to monitor our elasticsearch cluster’s health. With each health check, we monitor 3 things: connectivity, response time, and HTTP status code. For us, we also realized that each METRIC we check also rolls up to a COMPONENT. These are terms that statuspage.io uses. A COMPONENT is a top level consumer-facing element (like "Website" or "API"). COMPONENTs have status like "up" or "down", and can have incidents, like downtime or a scheduled outage. METRICs on the other hand are measured, like an API’s response time. For us, the METRICs of "API Health" ("Resque Length", "ElasticSeach Health", etc) all roll up to the single COMPONENT "API". We built StatusBot with this in mind. StatusBot lets you configure thresholds for each check, meaning that I can expect the Resque check to respond quickly (under 50ms) while a more complex check (checking that every posted job has been seen by a rabbit within 1 hour) which can take up to 10 seconds. This is the check.threshold option. I can also flag a check as having no impact, meaning that if the service goes down (IE: this blog is down), we should note it and fix it, but it doesn’t really count towards our downtime. Start monitoring your own sites with [**StatusBot** (github)](https://github.com/taskrabbit/statusbot) --- --- url: /blog/post/2016-01-09-switchboard-chat.md description: 'Today I want to announce the beta of a new website I built, switchboard.chat' --- *…yes, \*.chat is a TLD now.* ![](/images/medium-export/1__k7TOA8rMJW3wFmqxLyJsCw.png) From the Switchboard.Chat README: > *Evan made this for his wife Christina, a preschool administrator who was looking for a better way for teachers to communicate unplanned absences and to contact substitute teachers. She dreamed for there to be a way to reach and be reached by teachers (even those without smartphones) that did not necessitate using her personal cell phone. 
She also wanted a way for other school administrators to be able to view and sometimes help facilitate the notification processes. Subscribing to the mantra of "happy wife happy life", Evan created switchboard.chat to make this dream a reality.* > *There are many industries that still rely on SMS communications between employees, and are doing so with a shared (or personal) phones. This tool aims to make it easier for the managers of those teams to work with the tool they already use (SMS) but gain some control, centralization, and logging. Switchboard.chat aims to provide a cheap, sharable alternative to purchasing employee phones.* Here are some screen shots from the application: ![](/images/medium-export/0__NMkBcpHYcHJgXzcD.png) ![](/images/medium-export/0__j16fU__SE1WFSITa9.png) In a nutshell, switchboard.chat is an attempt to bring group SMS messages into the modern world. We’ve got / commands which you might be familiar with from tools like Slack or Hipchat (or video games before that). It was built with Angular 2, Bootstrap, Stripe, Twilio, and of course, [actionhero](http://www.actionherojs.com). I’m especially proud of the offline notification tools, so you don’t need to monitor the tool all the time, and will get notified when your team gets a messages. This (and the recurring billing system) make use of actionhero’s internal task system, which make this kind of thing a breeze. The site updates automatically with new messages (websockets) and being angular, is a pretty slick single-page app with a pretty neat collection of reusable components. **If you are part of a team that still relies on SMS for communication, give** [**switchboard.chat**](http://switchboard.chat) **a shot!** --- --- url: /blog/post/2012-03-30-testing-actionhero-with-blitz.md description: >- I recently had a chance to try out blitz.io (or click my affiliate link thing and get me some free tests), a new load-testing tool from… --- I recently had a chance to try out [blitz.io](http://blitz.io) (or click my [affiliate link](http://blitz.io/bhjPRfzrPxLMztQTzBZDFiR) thing and get me some free tests), a new load-testing tool from MuDynamics. They offer a product that is somewhere between Apache Bench and [Browser Mob](https://browsermob.com). [Apache Bench](http://httpd.apache.org/docs/2.0/programs/ab.html) has been around forever and is a command line tool to hit a page with many simultaneous requests and logs how long each request takes (and counts errors). Browser Mob runs a collection of cloud servers all over the world which actually render your site (using a full browser), capture screen shots, run JS, and look for errors. Both of these tools are awesome, but are useful in very different circumstances. When testing an API like [actionHero](http://actionherojs.com), I want to test more than just loading a url, but I don’t need the complexity (and cost) of Browser Mob’s full rendering powers. blitz.io is just what I need: ability to add (GET) variables, testing from multiple physical locations (I’m pretty sure they are EC2 based), and good support\*. Anyway, [demo.actionherojs.com](http://blog.evantahler.com/blog/demo.actionherojs.com) has been running for some time now on a micro EC2 instance, and I am curious to how it performs. I know that node.js is MADE to handle many http requests at once, but I wanted to know how much load the actionHero framework was adding to the server. 
blitz.io lets you test up to 250 simultaneous connections for free… and here are the results: The sub 10ms response time is why I think the blitz.io test servers are also in Amazon's cloud. Even at 250 users, the server is working great! The test I was running (action=cacheTest) saves and recovers a key-value pair from actionHero's internal cache. I ran the test a few times and I chose to show the most promising one here, but some of the other tests did show a few (less than 10) dropped connections. I'm looking into it, but I suspect the problem has more to do with haProxy than actionHero. I don't think that I really learned anything here about actionHero's performance profile, except that it's not terrible! If I didn't have a linear "hit rate" or if the response time grew drastically as the number of connections increased, I would know that I have a problem. Specifically with node.js apps, that would indicate that at some point the act of processing the request takes longer than the request itself. This would then add up as more and more requests were waiting in the queue. Looking good!

### Update!

The blitz.io team just gave me a credit (due to this blog post) to increase my simultaneous user limit to 500. I ran the test again with some changed parameters:

* 500 users
* Ramp up over 20 seconds rather than 60
* Use a randomly-generated UUID as the key and value for the cache save to ensure uniqueness

![](/images/medium-export/0__QSBznimk9H52rcbb.jpeg)

Still looking good, actionHero! This time when I ran the test, I decided to tail the actionHero log to see what was going on. All the requests were truly unique, but they all did appear to be coming from the same IP address (although it did change between each "rush"):

```text
2012-03-30 21:00:52 | action @ 10.72.245.143 | params: {"key":"key_23cc985520c55346d9de7a0e9e300d0f7f07a225","value":"val_23cc985520c55346d9de7a0e9e300d0f7f07a225","action":"cacheTest","limit":100,"offset":0}
2012-03-30 21:00:52 | > web request from 10.72.245.143 | responded in : 4ms
```

\*Good Support: They have an authentication process similar to Google Analytics where you need to have a specific (and random) URL return a pre-determined value to prove you own the domain. They have a fixed path you need to use, which would have required me making a new action to respond to. I didn't want to take down the demo server, so I asked them if I could authenticate by using a /public/ or /file/ path. They hooked me up in under 2 hours!

---

---
url: /blog/post/2018-03-28-testing-node-apps-with-selenium.md
description: >-
  The last time I used Selenium, in 2015, I hated it. It was slow, brittle, and
  difficult to get working. These days, it can actually be fun!
---

The last time I used Selenium, in 2015, I hated it. It was slow, brittle, and difficult to get working. These days, it can actually be pleasant!

![](/images/medium-export/1__r8lWHbH__mgWkl462lQsYuQ.jpeg)

Recently, in the [ActionHero](https://www.actionherojs.com) project, we found that we really needed a "full browser" integration test… something that we couldn't mock or accomplish with even a robust tool like [request](https://github.com/request/request). We needed to ensure that our HTTP and WebSocket libraries properly shared session & fingerprint information, which required cookies, headers, and 2 "full" protocols in the test… so we needed a real browser :/ We recently switched ActionHero's test suite from mocha to [Jest](https://facebook.github.io/jest/).
Jest is an awesome test framework for javascript projects (and react, and other things that *compile* to javascript). It supports parallel testing, watching & retrying, mocking, snapshotting… all the tools I was missing coming from Rails, the gold-standard for TDD frameworks. It turns out that some wonderful person has already done the heavy lifting to make a full-featured integration between Selenium and Jest… ***and it's actually simple to use!***

[**alexeyraspopov/jest-webdriver**](https://github.com/alexeyraspopov/jest-webdriver)

What follows is a step-by-step guide to writing a "full-browser" test in Jest on OSX, complete with saving off photos of the page. First, you'll need to install a few things into your node.js project:

```bash
npm install --save-dev jest jest-environment-webdriver
```

Then, if you don't already have `chromedriver`, install it (homebrew makes this easy on OSX):

```bash
brew install chromedriver
```

`chromedriver` is the tool that allows the Chrome browser to be "machine controlled" by selenium in our tests. Note that we do not need to install anything else like the selenium server.

Jest already has support for multiple "renderers". This is how it handles testing compiled-to-javascript files, like JSX. This means that we can signal to Jest in a given test file that it should use selenium. Jest uses magic comments for this:

```js
/**
 * @jest-environment jest-environment-webdriver
 */
```

The default is to use `chromedriver`, which is what we'll be using, but you can also test with Firefox, Safari, and other browsers. Using `jest-environment-webdriver` means that we get a few new global variables we can use in our tests, specifically `browser`, `until`, and `by` (full list [here](https://github.com/alexeyraspopov/jest-webdriver/tree/master/packages/jest-environment-webdriver)), which we will use in our test.

From here on out, you can use normal Jest commands to start your server in `before` blocks, configure whatever you need… and control your browser in the test. We can continue to use the normal Jest/Jasmine [assertions](https://facebook.github.io/jest/docs/en/expect.html). In this example, we'll be testing [www.actionherojs.com](http://www.actionherojs.com) for a few things, but you'll probably be testing localhost.

`File Location: __tests__/integration/test.js`

```js
/**
 * @jest-environment jest-environment-webdriver
 */

const url = "https://www.actionherojs.com";

describe("www.actionherojs.com#index", () => {
  test("it renders", async () => {
    await browser.get(url);
    const title = await browser.findElement(by.tagName("h2")).getText();
    expect(title).toContain("reusable, scalable, and quick");
  });

  test("loads the latest version number from GitHub", async () => {
    const foundAndLoadedCheck = async () => {
      await until.elementLocated(by.id("latestRelease"));
      const value = await browser.findElement(by.id("latestRelease")).getText();
      return value !== "~";
    };

    await browser.wait(foundAndLoadedCheck, 3000);
    const latestRelease = await browser
      .findElement(by.id("latestRelease"))
      .getText();
    expect(latestRelease).toEqual("v18.1.3");
  });

  describe("save a screenshot from the browser", () => {
    test("save a picture", async () => {
      // files saved in ./reports/screenshots by default
      await browser.get(url);
      await browser.takeScreenshot();
    });
  });
});
```

Your test can now be run via the normal `jest` command. That's it!
```text jest __tests__/integration/simple.js PASS __tests__/integration/simple.js www.actionherojs.com#index ✓ it renders (770ms) ✓ loads the latest version number from GitHub (267ms) save a screenshot from the browser ✓ save a picture (784ms) Test Suites: 1 passed, 1 total Tests: 3 passed, 3 total Snapshots: 0 total Time: 3.204s, estimated 6s ``` *Note that there is no need to start or stop the `chromedriver` or selenium server (this handled for you).* Selenium is very powerful ([full api docs here](http://seleniumhq.github.io/selenium/docs/api/javascript/)). You can type input, scroll the page, get and set cookies, etc. If you do find that you need a "full" integration test, this is a very painless way to do it! --- --- url: /blog/post/2025-01-15-the-components-of-an-ai-data-pipeline.md description: The ELT/ETL for AI --- ![title card](/images/posts/2025-01-15-the-components-of-an-ai-data-pipeline/image-1.png) Today, Airbyte is the best way to load data into your data warehouse or data lake. Whether your content comes from databases, files, or APIs, Airbyte can move your data [quickly and reliably](https://airbyte.com/blog/1-0-prime-time). Airbyte has built the largest [connector catalog](https://docs.airbyte.com/integrations/), and the tools to make your own, via our [development kits](https://docs.airbyte.com/connector-development/cdk-python/) or our [low/no-code](https://docs.airbyte.com/connector-development/config-based/low-code-cdk-overview) connector builder. The Airbyte platform runs in our [cloud](https://airbyte.com/product/airbyte-cloud) or your own [datacenter](https://docs.airbyte.com/enterprise-setup/). The Airbyte protocol and our orchestration engine are able to handle [incremental syncs or full refreshes](https://docs.airbyte.com/using-airbyte/core-concepts/sync-modes/), across multiple [generations of data](https://docs.airbyte.com/operator-guides/refreshes). Airbyte is the right choice for doing data movement for data warehouses, like [ELT](https://en.wikipedia.org/wiki/Extract,_load,_transform). However, the needs of modern AI applications, specifically those focusing on interacting with, providing context to, or augmenting LLMs, need more than what traditional data movement solutions can provide. This blog post aims to provide a taxonomy of the components of an AI Context Pipeline, and how it differs from existing ELT/ETL pipelines. Working backwards, the goal of an AI Data Pipeline is to produce a context collection for the LLM which will either provide documents for [RAG](https://en.wikipedia.org/wiki/Retrieval-augmented_generation#:~:text=Retrieval%20Augmented%20Generation%20\(RAG\)%20is,intelligence%20models%20information%20retrieval%20capabilities.) search or [Function Calling](https://platform.openai.com/docs/assistants/tools/function-calling) capabilities. The storage for this data looks less like a data warehouse (with multiple tables for different business objects which can be joined together on demand), and more like use-case specific tables with documents and metadata designed for [hybrid search](https://airbyte.com/blog/choose-a-database-with-hybrid-vector-search-for-your-ai-applications). Hybrid search means that you are able to both perform a vector/similarity search on the content (e.g. “Who are my customers in New York?” would return documents where users noted their address as “NYC”, because that is similar to “New york”), as well as traditional WHERE clause filtering on the data (e.g. where deal\_stage=pending and customer\_country=usa). 
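To make that idea concrete, here is a rough illustration of what a hybrid query could look like, expressed here as a SQL string in TypeScript. The table and column names (`contacts`, `embedding`, `deal_stage`, `customer_country`) are hypothetical, and `<=>` is the pgvector cosine-distance operator; your database of choice will have its own equivalent:

```ts
// Rough illustration only: a hybrid search combines a vector-similarity
// ranking with ordinary WHERE-clause filters over the same table.
const hybridSearchSQL = `
  SELECT id, document
  FROM contacts
  WHERE deal_stage = 'pending'
    AND customer_country = 'usa'
  ORDER BY embedding <=> $1  -- $1 is the embedding of the user's question
  LIMIT 10;
`;
```

The WHERE clause handles the exact, structured filters, while the ORDER BY ranks rows by similarity to the question's embedding.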
So how do we get there? # The AI Data Pipeline We believe that an AI Context Pipeline has 6 major steps: Extract, Normalize, Load, Build Context, Evaluate, and Consume. Each of these steps produce artifacts which are consumable, observable, and replayable, much like the various tables in a [data warehouse medallion architecture](https://www.databricks.com/glossary/medallion-architecture). Each of these 6 major steps and their subcomponents can be modularized. ![flowchart of the data pipeline](/images/posts/2025-01-15-the-components-of-an-ai-data-pipeline/flowchart.png) ## Step 1: Extract This is Airbyte’s bread and butter, and building connectors while maintaining them is hard. In fact, we couldn’t do it without the largest [open-source community](https://github.com/airbytehq/airbyte) of connector developers. API changes (both planned and unplanned), data-type bugs, rate limit problems, vendor outages, and handling authentication quirks are only some of the problems that Airbyte’s connector development teams solve. Airbyte also works with vendors to sync data incrementally whenever possible (e.g. syncing only the changes since last time), to be as fast and cost-efficient as possible. On top of that, there are questions of syncing strategy. For example, would this particular API source benefit from parallelism, or would that make things worse? Does this database support [CDC replication](https://en.wikipedia.org/wiki/Change_data_capture), and if so, what’s the best way to implement it? We’ve learned that syncs are stateful (needing cursors, refresh tokens, etc) and produce a schema. Airbyte handles all of these things. Airbyte’s AI pipeline leverages our existing connector catalog and syncing engine to robustly get your data out of source systems into the first stage of this pipeline. In addition to the above, there are 2 additional concepts that are important when thinking about AI use cases: dealing with unstructured/non-textual content, and permissions. Due to these new requirements, we are building custom blob-storage sources for our AI Pipeline that are different from the standard Airbyte sources. ### Extracting Unstructured Data & Metadata For AI use-cases, you need to get text from all of your sources. For APIs and databases, this is easy, but things get weird when object storage sources are involved (e.g. SFTP servers, S3 buckets, Google Drive, etc). Specifically, we are talking about the ability to read a directory of PDFs or Word documents, and extract the text contained within. This is sometimes called [OCR](https://en.wikipedia.org/wiki/Optical_character_recognition), parsing, or simply reading unstructured documents. Airbyte has built expertise in this area over the past year and integrated this capability into our S3, Google Drive, and similar object-storage sources. When running sources in this mode, Airbyte will produce large textual records for each document found (e.g. PDF file), with the content converted to [markdown](https://en.wikipedia.org/wiki/Markdown). We find that markdown strikes the right balance between being machine and human consumable, maintaining enough of the semantic and layout information from the document (headers, links, etc), while still being performant and small. Along with the markdown version of the content, we’ll also extract metadata from the original source (original file name, mime information, etc). You’ll want to use this information in later steps in the pipeline. 
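As a sketch of what that output could look like (the field names here are illustrative, not Airbyte's exact schema), each extracted document might become a record shaped roughly like this:

```ts
// Hypothetical shape of one record produced when an object-storage source
// extracts a document: the text converted to markdown, plus metadata carried
// forward from the original file in the source system.
export interface ExtractedDocumentRecord {
  contentMarkdown: string; // e.g. "# Master Services Agreement\n\n## Term\n..."
  fileName: string;        // e.g. "contracts/acme-2024.pdf"
  mimeType: string;        // e.g. "application/pdf"
  sourceUri: string;       // e.g. an S3 or Google Drive URI
  lastModified: string;    // ISO-8601 timestamp from the source system
}
```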
## Permissions and ACLs In traditional data warehouse work, access to the tables in the warehouse is controlled by the humans on the data/analytics team. If the viewer’s role is appropriate, they can view the table. For example, anyone on the finance team can see the purchases table/mart, but they can’t see the whole customers table/mart. It’s likely that those [data marts](https://aws.amazon.com/what-is/data-mart/) were produced by combining multiple source tables together, which in the source system, had a divergent permission model from what we want to accomplish in the data warehouse. So, because they don’t really relate, the permissions from the source system could be ignored as part of the extraction process and rebuilt later. This might not be the case in AI workflows. When building secure AI applications, it is imperative that the context provided to the LLM only include content that both the machine and end-user is allowed to see. Relying on the AI itself to guard sensitive information has been regularly shown to be a [flawed](https://genai.owasp.org/llmrisk/llm062025-excessive-agency/) approach. To this end, when thinking about AI applications, a multi-stage permission model will be needed: * Source permissions: What users or groups could originally see this content? * Context collection permissions: What users or groups have access to this bundle of content? An example might be helpful - Say we are ingesting content from Salesforce and we are building a copilot for our sales team to prepare for calls. Perhaps your sales team is divided regionally, and in Salesforce, AEs are limited to seeing information only within their own countries. It should follow that the deals each AE is allowed to ask the LLM about should follow those same rules in the copilot. That information could be reflected in your Salesforce configuration, and if so, we should include a list of groups that have access as metadata to each opportunity and contact we extract. But, that information is equally likely not to be available in the Salesforce API, so we’ll need to re-create it in our context collections. In that case, we could use the contact’s country code to split the data we have into USA and EU collections, and then grant access to each that way. Modeling permissions properly depends on the source’s capabilities and the use case we are building to. Knowing that this work will be custom for your business, we are building first-class primitives to model and manipulate permissions throughout the pipeline. Learn more about our approach to handing permissions and identities [here](https://airbyte.com/blog/permissions-for-ai-use-cases). ## Step 2: Normalize Once we’ve extracted the data from the source, we need to get it into a shape (né schema) we understand. This is a process called normalization. We generally want to prepare data for the downstream steps in a known format - we want all data that looks like a contact from a CRM system (Salesforce, Hubspot, etc) to have the same shape and data types. This will make future transformation easier and composable. We also want to have a spot to deal with API drift from upstream APIs. Finally, in some cases, we will want to merge data from a few streams into one. This is also the step where we validate the data we got from upstream, against an expected schema. At the end of this step, we’ll have incremental data files in known format, now ready to load into our working database. 
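As a minimal sketch of the idea (with illustrative field names, not Airbyte's actual code), normalizing a CRM contact into a known shape and validating it might look something like this:

```ts
// A minimal sketch of normalization: map a source-specific CRM payload into
// one known "contact" shape and validate it before the Load step.
export interface NormalizedContact {
  id: string;
  email: string;
  fullName: string;
  company: string | null;
  updatedAt: string; // ISO-8601
}

export function normalizeCrmContact(raw: Record<string, any>): NormalizedContact {
  const contact: NormalizedContact = {
    id: String(raw.Id),
    email: String(raw.Email ?? "").toLowerCase(),
    fullName: [raw.FirstName, raw.LastName].filter(Boolean).join(" "),
    company: raw.Account?.Name ?? null,
    updatedAt: raw.SystemModstamp,
  };

  // Validate against the expected schema so bad records fail loudly here,
  // rather than surfacing as confusing errors later in the pipeline.
  if (!contact.id || !contact.email || !contact.updatedAt) {
    throw new Error(`Record failed normalization: ${JSON.stringify(raw)}`);
  }
  return contact;
}
```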
## Step 3: Load (Datasets) In a [blog post](https://airbyte.com/blog/choose-a-database-with-hybrid-vector-search-for-your-ai-applications), I talked about the properties a database needs for hybrid search. From this point forward, we’ll be working with our data loaded into a platform like Motherduck, Clickhouse, or Elastic, which, at the time of writing, are some of the best open-source horizontally-scalable candidates for this kind of application. To start, we are working with MotherDuck/DuckDB databases. The schema for our data in this database is roughly: * `{organization}-{workspace}-{dataset}` * `{organization}-{workspace}-{dataset}-{collection}-records` * `{organization}-{workspace}-{dataset}-{collection}-documents` Datasets are the root node of analysis, and are more-or-less 1-to-1 representations of the data in the source system. In Airbyte lingo, datasets are the destinations of an incremental sync. Every time we sync against your source (and we can check for updates on the order of every few minutes), all the new and updated information lands here. This is also the place where sync errors will be displayed if they occur. From datasets, we build multiple context collections. ## Step 4: Build Context (Collections) Context Collections are the *use-case specific filters and transformations* of a dataset that are needed to power *a specific AI application*. There’s a lot to unpack in this definition: * Filters will remove data from the dataset in this context collection, if desired. This means you can filter a dataset into multiple context collections with subsets of the data. * Transformations (sometimes called mappings) are the creation of additional columns in the dataset that are derived from the data we already have A Context Collection is use-case specific. Depending on what you are trying to do, you might take the same dataset data and manipulate it in different ways to get the outcome you want. Different AI/LLM applications tend to do best with different search documents. Or, you might want to bifurcate your dataset based on permissions or access roles. It’s important that datasets and collections stay in sync - as new data is added to the dataset, all the collections related to it need to be updated soon afterwards. Airbyte’s AI studio takes care of this for you. The context collection is the most novel part of an AI data pipeline, and where we will be spending the majority of our time with design partners. There are thousands of possible manipulations to data which might be desirable, and we are interested in finding the most common patterns and building the tools so you can extend them. The entire workflow of preparing a record for use includes: 1. Filtering 2. Transformation / enrichment / applying mappings 3. Document creation 4. Calculating embeddings 5. Evaluations 6. Making ready Before we get into each step, let’s talk about the workflow of the whole process. Airbyte is building a “Context Collection Playground” so that you can experiment in close-to-real-time with each of these steps. In time, you’ll be able to run these steps locally, and check in your work to git. This is where your use-case specific business logic lives, your company’s secret sauce. You’ll also want to compare versions of collections against each other for accuracy, cost and speed. We aim to make this activity easy and fast. ### Filtering The first step is to remove any records from the dataset that you don’t want. 
Maybe EU data shouldn’t be included in a collection for USA users, or perhaps you want to introspect the permission data which was extracted from the source to make your decisions on what to include (e.g. removing internal users). It’s important to assume that anything that’s in the collection could be leaked by the LLM to the user, so be sure that what you are including is safe. If you have different groups of users with different access rights, make different collections - we make this easy. Note that this pre-filtering is not the same as search filtering. This layer of filtering is to remove content that no one should see. However, if there’s information that only some users should see (e.g. “show my what meetings I have today”), and you can represent that query with strong guarantees at query time (e.g. where user\_id = 123), then you might not need to pre-filter those documents… Thanks, hybrid search! ### Transformations and Enrichment We think there are a few rough categories of transformation folks might want to do to the data in a collection, and we call each of these a “mapper”, and each of these adds a new “virtual column” to your data, which you can then use for future mappers and in the record’s document. The general types of mapper are: * **Removal** - You may want to null out a particularly sensitive piece of data so that it is not available downstream or to the LLM. * **Derived** - Using the data already present, you may want to compute a new property. For example, you may want to combine first\_name and last\_name into a new full\_name property, or you might want to extract the domain of a user’s email address after the @ sign to guess what company they work for. Today, we provide a Jinja interface to your data with custom filters that make this pleasant. * **API** - You might want to enhance your data by hitting an API. There are Enrichment APIs to learn more about users by their email address, or look up if they’ve been banned by your payment gateway by user\_id, etc. * **AI / LLM** - Passing the data you have to an LLM is a great way to produce summaries, do sentiment analysis, and other non-deterministic tasks. * **Cross-dataset Joins** - You may want to join data from one dataset with another, for example adding your internal user\_id to your Salesforce data. * **Code** - And finally, for any use-case not covered here, we will allow you to run custom code against each record to produce new fields. ### Document Creation Once you have all the data you’ll need, thanks to the previous transformation steps, it’s time to build the search document. This is the text that will be embedded and used for vector/similarity search. Producing good documents is both an art and science. You want to provide the LLM with the fewest documents with the highest quality that get the job done, but you also want each document to be internally complete. Oh, and each record might need to be broken up into multiple smaller documents to fit in a limited context window. For example, consider a very large PDF of a legal contract you might be interested in searching. The document is 100s of pages long, so you can’t feed the whole thing into some LLMs, so you need to chop it up (a process called [chunking](https://www.pinecone.io/learn/chunking-strategies/)). Chopping the document up by section (e.g. via heading) is probably a good starting point, but it’s often the case that the section you are on might not have the full context to explain itself (i.e. not internally complete). 
If this was a commercial real estate contract, and you were trying to ask the question "who is in charge of repairing the elevators?", the document chunk would need both the exact text "... and the repair agency will be responsible for preventive maintenance and repair, visiting the site not less than 4 times a year", and the preamble "Evan's Elevator Repair Company LLC will be known as 'the repair agency'". Knowing that relevant document creation for a commercial real estate contract probably needs the current chunk, the full tree of section headers, and the preamble is both hard to figure out, and the secret to building a powerful AI application. Building an application which will enable you to iterate quickly and test your results is a big part of what we are building.

### Calculating embeddings

Once you have your document built, you need to turn that text into embeddings - a vector representation of the text that can be similarity searched quickly. This is a compute/GPU-intensive step that we make easy and parallelize. There are [hundreds of algorithms/models](https://www.sbert.net/docs/sentence_transformer/pretrained_models.html) you can choose for this step, and experimenting with the cost and speed of each will be important. Of note, what you embed probably won't be the document itself. A section, a summary, or even a [representation of the document that was produced by an LLM](https://www.anthropic.com/news/contextual-retrieval) might produce better results.

### Making Ready

Within the collection itself, there's a lifecycle to consider. Some of the transformations (and later evaluations) might be slow or expensive, especially if an external API is involved. That means that a useful collection needs to serve the current version of a record while the new version is still processing. The collection also needs to be resilient to any errors that occur in the transformation process or retries that are needed when hitting external systems. Airbyte leverages our syncing experience to handle this lifecycle gracefully and expose the metrics you will need to understand this workflow. We deduplicate older versions of the records away as new updates flow through the system and become ready.

## Step 5: Evaluate

Like all software, you need to test your pipeline to be sure that it works. Because everything above this step was leading up to an LLM using the data, and LLMs can have unpredictable behavior, the only way to know whether changes are helpful is to test them against a robust evaluation suite. It's important that we make this easy, fast, and fun. Testing document creation has a few forms that we will be building out:

* **Deterministic testing** - Given this query or search, are the proper documents returned?
* **Scoring** - Given this set of transforms and documents, do the results of a search of known content produce better or worse results?
* **Feedback/voting** - Given this set of transforms and documents, are the results our application is providing better or worse, according to our users?

Some of these evaluations can be expressed within a testing framework like pytest, while others require collecting data from users and evaluating after the fact. Testing context documents is both an art and a science.

## Step 6: Consume (APIs and Functions)

It is finally time to use your data!
Once everything is ready, the same collection can support a number of interfaces:

* **Hybrid search** - "What users are in Houston?" can be answered from our Salesforce contacts collection, which may also know to apply traditional filters to only consider records in which the user is active.
* **Function calling** - "How many users do we have?" is best answered by a COUNT SQL query, and we've made it possible to run text-to-SQL and execute the query via a functional interface and agentic workflow.

We can also combine the access to multiple collections and specialized instructions in an "assistant." These are agents that determine which context collections to use and how to use them based on varied user input. There are even more ways to consume this data and we can assist in co-designing your applications accordingly!

# A Context Collection Example

Putting this all together, let's walk through an example. The use-case is the same as above - we would like to build an AI copilot for our sales team, using our Salesforce data. We are starting with two use cases in mind:

* We want a chat interface (co-pilot) to use mid-call to ask questions of the prospect's account (e.g. "when was our last contact" or "how much did you pay last year in your contract"... or even "Is customer X likely to churn?")
* We want an application interface for the manager to understand the status of all the deals in flight (e.g. "how many deals are open now that are likely to close by the end of the month?")

I've chosen these two examples because one operates on specific data about one object (the prospect's history), while the other operates on a set or aggregation of all the data in the collection. The first use case could be solved by RAG, and the second with a SQL function call. The table below ties the above concepts together and shows what an AI pipeline for this application might look like.

![example steps](/images/posts/2025-01-15-the-components-of-an-ai-data-pipeline/example.png)

## RAG

At the end of the Build Context step, the documents we create from our Salesforce data in the `USA_CONTACTS_COLLECTION` could look something like this:

```
# Contact: Evan Tahler

## Details:
* Email: evan@airbyte.com
* Phone Number: 123.456.7980
* Role: Director of Engineering

## Company Details
* Airbyte
* Address: San Francisco, USA
* Website: http://airbyte.com

## Product Interest
Airbyte Cloud

## Opportunity Status
* State: Open
* Estimated Annual Contract Value: $100

## Other Company Contacts
* Frank
  * email: frank@airbyte.com
  * role: account manager
* Sally
  * email: sally@airbyte.com
  * role: CEO

## Interactions:
1. Cold Call
  * Summary: cold outreach, customer showed interest and wanted more product details.
  * Next Step: Call next week
2. Sales Call
  * Summary: call went well - customer wants pricing info customized for them.
  * Next Step: email in new year re pricing
```

These documents are then embedded and stored alongside the traditional columnar data that provided this information - an opportunity-name and deal-status column would exist as well. Of note, we've taken relational data from a few Salesforce APIs and combined it into a document - an LLM's preferred format. The chat interface works via RAG. During the sales call, we might ask our chatbot "who else have we talked to at Airbyte", and it would find the document above and pass that as context to the LLM, which would then extract the useful information we asked for - "Frank and Sally".
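As a condensed sketch of that flow (the helpers here are stand-ins for your embedding model, hybrid-search database, and LLM client, not Airbyte's API), the whole round trip is roughly three steps:

```ts
// A condensed sketch of the RAG flow described above. The three helpers
// passed in (embed, searchCollection, askLLM) are hypothetical stand-ins.
export async function answerWithRAG(
  question: string,
  embed: (text: string) => Promise<number[]>,
  searchCollection: (queryEmbedding: number[], limit: number) => Promise<string[]>,
  askLLM: (prompt: string) => Promise<string>
): Promise<string> {
  const queryEmbedding = await embed(question);                // 1. embed the question
  const documents = await searchCollection(queryEmbedding, 5); // 2. find similar documents
  const prompt = [
    "Answer the question using only the context below.",
    "## Context",
    ...documents,
    "## Question",
    question,
  ].join("\n\n");
  return askLLM(prompt);                                       // 3. let the LLM extract the answer
}
```

Calling `answerWithRAG("who else have we talked to at Airbyte?", ...)` with real implementations of those helpers would retrieve the document above and let the LLM answer from it.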
You can learn more about how RAG and Hybrid Search work in our [previous blog post](https://airbyte.com/blog/choose-a-database-with-hybrid-vector-search-for-your-ai-applications), but in a nutshell, we are asking the database to find documents similar to the keywords in our question, to then pass to the LLM to sort through - letting the LLM judge what is relevant from the context provided in this document. The RAG query will likely provide not only the right document, but probably some false-positives as well (e.g. if the notes from another call mentioned "Airbyte", or perhaps if another company name is similar). LLMs have powerful reasoning capabilities to sort through this… if they are given the data.

## Tools

The manager's question is a little different - "How many opportunities are currently in the open state". This is a question that can't be solved by RAG alone, as reading every deal would be required, and that won't fit into the LLM's context window… and we want an exact numerical answer here, not a summary. This is where [tools](https://python.langchain.com/v0.1/docs/modules/tools/) come in. Tools provide "APIs" or "functions" to allow the LLM to ask a third party service for information. Tools self-describe when they should be used, and their inputs and outputs. Airbyte provides a text-to-SQL interface over all of the collections we build for exactly this purpose. Visualizing a tool definition looks something like this:

```
tool:
* Name: text-to-sql for collection "USA CONTACTS"
* Description: When asking for specific counts or other aggregations of data pertaining to salesforce opportunities in the USA, use this tool. The schema of the table is:
  * id: integer
  * document: text
  * deal_status: text
  ...
* inputs:
  * aggregation: enum[count, average, sum]
  * aggregation_column: choose from the columns in the table's schema
  * column_filters: what should the aggregation_column be filtered on
* output: number
```

The information above is enough to allow the LLM to do a few things:

1. Decide when to use the tool. It now has enough context to know that if the user is asking for a "count" or "aggregation", it should try calling this tool. Otherwise, it might choose another tool (RAG search is also expressed as a tool), or be able to answer based on its training data or the messages already present in the conversation.
2. The tool's input is described so that the LLM can keep asking for more data from the user until satisfied. In this case, [count, deal_status, open] would be the inputs, leading the tool to eventually run the query `select count(*) from collection_table where deal_status='open'`.
3. The tool's output is described as well, so that the LLM can format a reasonable response, e.g. "there are 7 open deals".

As the collection provides multiple tools, including RAG search and SQL function calling, a rich interaction can take place where the manager can ask for more information about those deals, and the LLM can switch over to loading the relevant RAG docs when asked.

# Summary

The goal of an AI Context Pipeline is to extract and prepare data into a format that is appropriate for multiple LLM use-cases. This includes document creation and embedding, but including what you need for additional analysis and tools is also important to build robust and powerful AI applications. If you are interested in learning more about how Airbyte can help you with these pipelines, please reach out!
--- --- url: >- /blog/post/2013-02-28-the-real-reason-i-will-not-be-your-technical-cofounder.md description: >- I have been reading quite a few posts lately concerning the topic of becoming a ‘technical co-founder’ of a startup. Just search for the… --- I have been reading quite a few posts lately concerning the topic of becoming a ‘technical co-founder’ of a startup. Just search for the term on Hacker News, and I’m sure you will find posts on the topic from within the current week. Many other people have spoken eloquently on the topic, but I have always felt that the conversation was dancing around one key issue: **The #1 reason that I will turn you down is that I don’t respect you.** If you are not presently working on your product because you don’t have the skills, I will assume it is because you are lazy and not intelligent enough to do so. There I said it. It’s out there in the open, in all its cold mean glory. I am not saying that I implicitly don’t respect non-technical founders, as quite the opposite is true! What I don’t respect is non-technical co-founders using their non-technical-ness as a roadblock. Susan Koger, the founder of [ModCloth](http://www.modcloth.com) is not a technical person, but I greatly respect her expertise in her industry, her leadership, and the company she and Eric have built. The Kogers cobbled together the first version of the site in a few days with some ugly PHP and did all fulfillment by hand for a number of years. There were *tons* of optimizations a more ‘technical co-founder’ would have done, but they didn’t let that stop them. They made it work and organically attracted the help they needed to grow. The team I currently work with at [TaskRabbit](http://www.taskrabbit.com) I respect more than any group that I have worked with to date. I am constantly impressed with the ‘non-technical’ side of the house. The best example I have is that one night while I was leaving, I was asked for an SSH tunnel to one of the database replicas by our Member Services team. It turned out the WHOLE MEMBER SERVICES team had taken the initiative to take online SQL classes together, and wanted to explore our data. I stayed late and granted that access. Today, I helped our head of Trust and Safety write some ruby that allowed him to automate rather complex data extractions into a single table. It was terrible, buggy code, but it worked. He didn’t want to give the task to the engineering team yet because he wasn’t ready to ‘production-ize’ it until he was able to try it out first. We have email marketers who write their own HTML and CSS, and write it well. We have PMs who create micro-sites and understand the gory details of oAuth because of it. We have Busniess Development folks who can solder and write python. I respect this team. I’m a fairly terrible designer, but that doesn’t stop me from attempting to create a look & feel for my sites. I love the term "Programmer Art" which the video game industry uses when describing the blocky placeholder models which populate the pre-alpha releases of almost any game. Programmer Art is one of the best communication tools which exists. It communicates spacial layout, movement assumptions, rendering limitations, and most importantly, allows the rest of the team to clearly see which sections need the most help and can then prioritize their own work. My point is this: If you are a ‘business guy’ who thinks you can’t start making your website or app because you need a technical co-founder, you are wrong and wasting your time. 
There are literally hundreds of tutorials, free software downloads, books, examples, and templates out there for you. There are tools like DreamWeaver that will let you make a website as easily as making a Power Point deck. And it will be terrible and incomplete. However, I would **always** rather see a terrible but functioning demo than a pretty deck (speaking as a potential technical CoFounder). You will gain valuable insight into what will eventually become a key part of your business even if you try and fail. Your effort shows me you care enough to do it, and are smart enough to try. Also, I have a job. --- --- url: /blog/post/2023-01-19-airbyte-connector-release-stages.md description: Airbyte Connector Release Stages --- ![Connector release stages](/images/posts/2023-01-19-airbyte-connector-release-stages/image.png) As of the start of 2023, Airbyte has over [300](https://docs.airbyte.com/category/sources/) connectors in our Open-Source repository. At [Move(data)](https://movedata.airbyte.com/), our first conference, we [announced](https://airbyte.com/movedata-announcements) that we will soon be bringing most of these connectors to Airbyte Cloud - for free while they are in the Alpha or Beta Release Stages! In this post, I want to share more details about our Connector Release Stages, and how Airbyte uses them to ensure that you only pay for the most reliable and well-tested connectors. My name is Evan, and I’m the Engineering Manager for the Connector Operations team. The Connector Operations Team focuses exclusively on what it means to test, publish, and manage Airbyte connectors. In 2023, we are going to dramatically grow the number of sources and destinations that Airbyte can read and write to. In addition to our normal home-grown and community contributed connectors using the [CDK](https://airbyte.com/connector-development-kit) (connector development kit), we are soon going to release our [low-code connector builder](https://airbyte.com/movedata-announcements). This will make it so that you can build a connector for any API source without even writing a line of code. This will lead to a lot more connectors being developed in 2023! ## Overview In early 2022 we introduced a grading system for connectors called “Connector Release Stages”, and have been evolving them ever since: From our [docs](https://docs.airbyte.com/integrations/#connector-release-stages): > Airbyte uses a grading system for connectors to help you understand what to expect from a connector: > > **Generally** **Available**: A generally available connector has been deemed ready for use in a production environment and is officially supported by Airbyte. Its documentation is considered sufficient to support widespread adoption. > > **Beta**: A beta connector is considered stable with no backwards incompatible changes but has not been validated by a broader group of users. We expect to find and fix a few issues and bugs in the release before it’s ready for GA. > > **Alpha**: An alpha connector signifies a connector under development and helps Airbyte gather early feedback and issues reported by early adopters. We strongly discourage using alpha releases for production use cases and do not offer Cloud Support SLAs around these products, features, or connectors. When we build connectors at Airbyte, of course we write unit and integration tests like any good piece of software. 
We also write acceptance tests which we run against what we call [SAT](https://airbyte.com/blog/black-box-testing-data-connectors), or the Source Acceptance Test suite against real data in a sandbox account. For destinations, there’s also the DAT, or Destination Acceptance Suite. This collection of robust tests ensure that the connector performs well against our test’s expectations, and that the data we have in our sandbox accounts is properly transformed according to the [Airbyte Protocol](https://docs.airbyte.com/understanding-airbyte/airbyte-protocol/). However, the world of data integrations is full of complexity. Just because all of the tests are passing, that doesn’t mean that the connector will work performantly, and work for all of our user’s data. Perhaps the source operates differently depending on the type of account you have, exposing or changing the streams available to sync, or the content contained within. Also, in some cases, it’s quite hard to create a robust sandbox account with all the available test data. For example, simulating a Stripe account with every type of refund is non-trivial. It is with this mindset that we’ve added “Certification” to our [Testing Pyramid](https://martinfowler.com/articles/practical-test-pyramid.html) - to account for these unknown unknowns. ![Connector testing Pyramid](/images/posts/2023-01-19-airbyte-connector-release-stages/pyramid.png) ## Release Stage Philosophy Let’s dive into the connector lifecycle. We group the requirements for connectors into the following categories: * Usage * Test Coverage * Reliability * Docs * Streams * Featureset * Databases and Destination Specific Concerns You will notice that the first category is “usage”, and that is intentional. As connectors already are well-tested by this point, the main goal of certifying a connector is ensuring that we’ve seen enough different use-cases which are sufficient to expose any bugs we wouldn’t have gotten from our sandbox dataset. A diverse set of use-cases is most easily measured by counting the distinct Airbyte workspaces (some from OSS users and some from Airbyte Cloud) using the connector, and each release stage has an ever-higher bar. Yes, our community (and soon Airbyte Cloud users) running Alpha and Beta connectors is a requirement of getting a connector to GA. This is one of the many reasons we want to encourage you to give them a try and let us know about your experience with the connector! We’ve built telemetry in to capture crashes and bugs from Airbyte users so that we can be sure that connectors really are performing at the level we expect. ## Requested & Pending Connectors Connectors begin life with you! When you discover the need to move data from an internal or external API that Airbyte doesn’t yet support, that is when a connector is born. For connectors that we manage, we have a number of ways which you can [request](https://airbyte.com/connectors/request-a-connector) and [vote](https://airbyte.com/connector-requests) for a connector for us to focus on - these are called “Pending” connectors. ## Alpha Connectors Once the connector exists, and it can move the data for at least one stream, it moves into the Alpha stage. Connectors in this phase are experimental, and can be thought of as a MVP (Minimum Viable Product). Not all streams and edge cases will be supported yet, but you can certainly use these connectors to get some data into your data warehouse! 
### Alpha Criteria: **Usage‍** * None, as this is a new connector **Test Coverage** * The connector passes the SAT test suite **Reliability‍** * None, as this is a new connector **Docs** * Basic documentation is provided so that a technical user can set up the connector, via a connector’s Setup Guide in the docs * The connector’s specification includes all the required information from a user (e.g. API keys and account information) * A CHANGELOG for the connector has been started **Streams** * All applicable streams have a primary key * Incremental Syncs supported on all streams which offer them **Featureset** * oAuth, if applicable for this connector, works **Databases and Destination Specific Concerns** * Not Applicable ## Beta Connectors After a connector has been released and we are seeing usage, it can move into the Beta release stage. The goal of Beta is to move a connector from MVP to MLP - Minimum Lovable Product. While an Alpha connector provides the minimum amount of data to be useful, a Beta connector should provide all the data anyone could reasonably want from that source, or at least all the data the source API will provide. ### Beta Criteria: **Usage‍** * At least 25 distinct workspaces between Airbyte OSS and Airbyte Cloud are actively using the connector at the time of certification. **Test Coverage** * At least 90% unit test coverage. **Reliability‍** * All available streams include expected records i.e. save examples of the records produced as snapshots * Source can reliably access all streams (or skip them if they can’t) i.e. we can gain the required permissions to sync * Any severe open issues have been closed * The connector checkpoints i.e: periodically saves its sync progress * The connector is on the latest CDK & SAT versions **Docs** * Icon required * Links to vendor or API docs explaining what each stream is and how to use it **Streams** * (same as before) **Featureset** * oAuth, if applicable for this connector, works. * Secure Connections **Databases and Destination Specific Concerns** * Connector properly supports multiple data types Waiting for the connector to obtain sufficient usage is how we collect bugs and issues that, once triaged, help us become confident in the connector’s quality. Beta connectors introduce the concept of “expected records”. This is our shorthand for saying that we need to do the work to fully seed our sandbox accounts with data for every stream. Then, we need to capture the data produced from a sync and check it into the codebase as a snapshot of that data. This allows us to run a sync with the connector [every night](https://github.com/airbytehq/airbyte/blob/master/airbyte-integrations/builds.md) and compare the data produced today with the previous snapshot. We will be alerted to 2 different types of failures: code changes that break the connector and changing upstream APIs. While we only publish new connector versions to Docker Hub when we make a change, we do test the build process for every connector against our latest CDK to ensure that the connector still can be published and take advantage of any new features or speedups we add. The second case, changing APIs, is when an API provider modifies the data returned from an API endpoint. This alerts us so we can decide how to handle it. If a new property is added to a stream, but everything else stays the same, that’s a patch [semver](https://semver.org/) (semantic versioning) change, as Airbyte can handle adding new columns to your data warehouse. 
But a type-change or a removal is more severe, and that requires us to think about how to handle it, and how to communicate it to our users… usually resulting in a breaking change (and major semver version bump) of the connector. Airbyte's job is to make sure your data pipelines don't break, so thoughtfully considering what to do when a provider makes a breaking change is what we are here for!

Beta connectors are also required to checkpoint. Checkpointing is a word we use at Airbyte to mean that data output from the source is frequently "saved" in the destination. Checkpointing is good because it enables a failed incremental sync to be reliably restarted without needing to re-import the majority of the data from the previous sync attempt. The Airbyte Protocol has more detail on the topic, but in essence, we rely on the source emitting state messages regularly and the destinations regularly committing data to disk… and then informing the Airbyte Platform that everything has been persisted up to that state message - a checkpoint. State messages indicate a resumable sync location - like the page of the API we are on, or the OFFSET in a SQL source. Beta and GA sources need to emit a State message at least once every 15 minutes, and Beta and GA destinations need to persist the data that they have received at least every 15 minutes. This means that, in the worst case, Airbyte will persist all the data that's been moved to your data warehouse at least once every half hour.

The Airbyte Protocol has support for many different types of data (strings, numbers, dates, nested objects, etc). At the Beta phase, we want to be sure that the connector represents its data as correctly as possible using the proper [Data Types](https://docs.airbyte.com/understanding-airbyte/supported-data-types/).

Finally, for all of our connectors, we want to ensure that there is a way to connect securely to the source or destination. For API sources, this usually means connecting via HTTPS. But, for many of our database connectors, the protocol may only have optional SSL or encryption. For example, you can connect to postgres without requiring SSL. At the Beta stage, a connector must provide multiple ways of connecting if a secure protocol is not required by the upstream source or downstream destination. If a connector cannot connect over a secure protocol, it will not be released on Airbyte Cloud, even though it may be available to our OSS users. In fact, we strip out any insecure connection options for connectors deployed on Airbyte Cloud - you are required to use HTTPS, SSL, or an SSH proxy - which is why this requirement exists 🔐.

## Generally Available Connectors (GA)

Finally, a connector can become Generally Available when it passes our strictest criteria. Most importantly, this means that we believe that this connector is as robust as we can make it. We are confident of the connector's quality, and are therefore able to provide [support](https://airbyte.com/pricing) for it. Only at this point will we charge our customers to use these connectors on Airbyte Cloud.

**Usage**

* At least 50 distinct workspaces between Airbyte OSS and Airbyte Cloud are actively using the connector at the time of certification.

**Test Coverage**

* (same as before)

**Reliability**

* All available streams include expected records and SAT tests.
* Configuration options can reliably access all streams * Any severe open issues have been closed * The connector checkpoints * The connector is on the latest CDK & SAT versions **Docs** * Docs sufficient for a moderately-technical new user to configure the connector. **Streams** * (same as before) **Featureset** * oAuth, if required by vendor * Secure Connection Options * Secure Connections * Connector checks are fast **Databases and Destination Specific Concerns** * Connector properly supports multiple data types * Database sources support column selection Once again, the most important criterion is that a lot of customers are using the connector so we can collect and fix any bugs that might exist. In addition to a higher testing and usage bar, GA connectors also add the requirement that we can support column selection for databases. This forthcoming feature allows you to specify which columns you want to move to your data warehouse, leaving behind unimportant or secure data. At this stage, we also work with the API provider (in the case that the connector is an API Source) to meet any criteria they have to be in their catalogs. Usually, this means adhering to rate limits, sending specific headers, and implementing oAuth to their satisfaction. Finally, GA connectors need to operate at a reasonable speed, including at the setup and check phases of the sync. ## Summary Airbyte takes connector quality seriously. We plan to offer ever more connectors, and we have a robust program to test and certify these connectors. We want to strike the right balance between having a large catalog, so that you can connect to the largest breadth of data sources with Airbyte, and ensuring that you are only being charged for top-quality connectors. We do this via our connector release stages - Pending, Alpha, Beta, and Generally Available. We are constantly evolving what it means to be a “good” connector at Airbyte, and our list of requirements for each stage is getting stricter all the time as we hear feedback from you, our users. We are continuously testing more edge-cases in SAT, and adding new requirements to our certification checklist. With that in mind, some connectors which were certified to the Beta or GA levels in the past might not pass all the current requirements. However, we have built tools to alert us of this and have a team constantly revisiting and maintaining all of our connectors, and getting them up to the latest standards. As we evolve our connector release stages, we will keep this blog post up-to-date. Thanks for using Airbyte, and we look forward to making connectors with you! --- --- url: /blog/post/2019-03-02-the-voom-software-engineering-interview-process.md description: >- Building an inclusive and efficient interview process for a pair-programming team --- ![](/images/medium-export/1__9HNP1gUTyhHlUcbJBNYpgg.jpeg) ### Introduction Over the past year, we have been working on creating an interview process for our engineering team with these 4 goals in mind: * Efficiency * Focus on what the job will be like * Test only relevant skills * Be inclusive Airbus’ [new UAM (Urban Air Mobility) division](https://www.airbus.com/newsroom/stories/urban-air-mobility-the-sky-is-yours.html) has given us an incredible amount of leeway to craft a process that best meets our needs and helps us to find the best talent available. You will note that our goals are bi-directional: they help both the candidate and the company.
We are trying not to fall into common software engineering interview traps, and have come up with a rather unique process we are excited to share. Inspired by interview processes that we feel have worked well ([Pivotal Labs](https://pivotal.io/labs), [TaskRabbit](https://www.taskrabbit.com/), [Ministry of Velocity](https://www.ministryofvelocity.com/), and others) and aligned with our [Key Values](https://www.keyvalues.com/voom), here are the steps for the Voom Software Engineering Interview. ![](/images/medium-export/1__4a__pBxd6n9jO3Fg9l____VqQ.jpeg) ### Interview Steps Below are the actual emails we send to candidates along the way, followed by our philosophical rationale for why we do it. #### Interview Step 1: Hiring Manager Meet & Greet > Hello! > > The goal of this first one-hour interview is to meet you, and share the company and role with you in detail. We will be explaining how we work, what our short-term and long-term goals are, and what you can expect on an average day. We’ll be asking you questions about your past roles, what you liked, and what you are looking to get out of your next position. We will also lightly touch on your technical skills, but this is not a technical interview. > > We can meet either in person at our office in Downtown Seattle, or via video call… This first meeting is very social, often conducted in a coffee or doughnut shop. This helps the conversation be more… like a real conversation! Communication is important to us and engineers at Voom interact with other people often! This first meeting is conducted by 2 people: the hiring manager and another member of the engineering team. Not only does this help to remove any single person’s bias, it also helps smooth out the power dynamics of the interview *(Inspired by* [*this post from Marco Rogers*](https://firstround.com/review/my-lessons-from-interviewing-400-engineers-over-three-startups/)*)*. This is easier to do in person than over a video call, but we are flexible. We focus a lot on how we work as a team. While we meet many people who love to pair program, it’s not for everyone. It’s important that we share expectations on both *how* you will work and *what* you will work on. If things go well, the lead interviewer will let you know on the spot and explain what the next steps are in the interview process. #### Interview Step 2: Pair-Programming "Clean Room" exercise > Congratulations and thank you for continuing with the Voom interview process! > > The goal of this one-hour interview is for a senior member of the team to get a sense of your technical abilities and experience. We will focus specifically on the tools we use (Ruby & Javascript) in a structured "clean room" pairing exercise. It will also focus on the engineering process at Voom: pair-programming & test-driven development. Now we start to get technical: * "Clean room" means you will be starting from scratch, not using an existing codebase. * The "classic" example of this exercise comes from Pivotal Labs: "Let’s pair to re-create the Java Array Class without using the standard library". *We do something different of course, but the spirit is similar.* * A correct answer starts with a discussion about what the most important parts of "Array" are, and then uses TDD (test-driven development) to make failing tests pass. For example, you might first write a unit test that fails for a "push" method, then make it work.
Then, you can write a failing test for a "length" query method, then make it work… and you can then finally write an integration test that checks that "push" increased the response value of "length". * We don’t focus on your knowledge of the proper method invocations in the language, but on your understanding of the concepts being used. * We are purposely vague about which language the exercise will use, as both are important. * The real focus here is modeling a complex domain and having a conversation, not getting it "right". While this is not a demo of our codebase, it does showcase *how* we like to work. We pair 100% of the time and we write a lot of tests. If you have fun and excel in this interview, it’s a good hint that you might like working with us. Here are the questions we ask our interviewer after this interview: 1. How knowledgeable was the interviewee about technologies used during pairing? (1–5, explain) 2. How comfortable was the interviewee with exploring the codebase and understanding the design? (1–5, explain) 3. How comfortable was the interviewee with writing code? (1–5, explain) 4. Did the interviewee write tests and consider the design before writing implementation; alternatively, did the interviewee attempt to understand the scope of impact of their changes before writing code? (1–5, explain) 5. Would you be overjoyed to pair with the interviewee tomorrow? (Yes/No) If the interviewer does not want to move forward after the interview, the process is stopped here. #### Interview Step 3: Half Day of Pairing > Welcome to the final step of the Voom interview process! > > This 3-hour interview allows you to actually see what a day-in-the-life at Voom would be like. You’ll have access to our code and backlog. We believe in our work and in our process… let us show it to you! In exchange, this gives us the opportunity to see what it would be like to pair with you. We will be focusing on your comfort with TDD and pair programming, and how well you can use your existing knowledge of Rails & React to be productive in a new codebase. > > Our day starts at 9:30 with Stand Up, so please arrive at ~9:15 so we can get everything set up. You will pair until 12:30. Before you arrive, please sign our NDA … This is the main event. A 1/2 day of pairing; a mini-apprenticeship. Would you buy a car without a test drive? Why would you accept a job without one? * You not only get to see our codebase, but also our processes: do we *really* follow good agile practices? Are our stories really reasonable and small? What are the Product Managers and Designers like? * Oftentimes, the work done in the interview gets committed. Good pairing enforces quality, even with total strangers! * We do cherry-pick stories to find one that can be quickly completed and won’t involve too much domain knowledge, but an important principle is that the story for the interview really must be providing value to the company… just like any other day. We use the same interview feedback questions as in Step 2. Those are the attributes we value in a new team member. ![](/images/medium-export/1__lSwPi__0WBlfERWrly6VpRw.jpeg) ### What can we do better? In the spirit of transparency, here are some things we are working on to improve our process: * **Increasing top-of-funnel diversity**. Lots has been written about how the tech world is overwhelmed by white men (I’m one of them), and how this is bad. Diversity matters to Voom, *especially* since we are building a product for an international audience.
While ~50% of the entire product team identifies as female or an underrepresented ethnicity (for the USA), the engineering team is only 15% female. Opening up our senior positions to remote team members has helped, and we will be exploring this more this year. Unfortunately, we are unable to provide visas or immigration support at this time, so our remote opportunities are still limited to US citizens. To date, we’ve engaged 2 diversity-specific recruiting firms, with poor results, so we’ve stopped working with them. * **Getting more members of the team involved**. We want as many members of the team to meet each candidate as possible (and vice-versa). To date, only about 30% of our team has been involved in the interview process. Not only does this introduce potential bias, but it also introduces bottlenecks! When we were smaller, we ended the 1/2 day pairing interview with a team lunch the candidate was invited to, but now that we are bigger, we haven’t found the time. We are working to re-introduce opportunities for candidates to meet more of the team as the final phase of the interview. * **Providing Critical Feedback**: While our HR policies prohibit us from sharing why we reject candidates, we feel a responsibility to the programming community to provide feedback to anyone who asks for it. We are actively working to revise this policy so that we can not only share feedback with interviewees, but also provide guidance on how they can improve for future interviews. ![](/images/medium-export/1__k0fZ____UZvex4lOGIiWaSvQ.jpeg) ### Meta You will note that in none of our interview steps do we require homework. Why ask you to do "example" work when we can see what it looks like when you are working on a real story? You will also note that in none of the interviews do we require white-boarding. Voom isn’t creating a new database, and like most small to mid-sized companies, scale is not our problem… growth is. We care more about your ability to quickly and safely implement a new feature or bring in a new library than your knowledge of how a B-Tree works. If you need it, Google it. We have a bias toward hiring quickly. We can do this because we pair 100% of the time. Many great things have been written about the [benefits of Pair Programming](https://tuple.app/pair-programming-guide/the-case-for-pair-programming) ([we have too](https://blog.voom.flights/pair-programming-at-voom-how-our-code-takes-flight-423a05d63718)), but there are hiring benefits as well: it’s easier to accept the *risk* of a new employee when you can be sure that they will be consistently mentored. There’s no normal workday scenario in which a new employee will be working alone and merge in dangerous code if they are always working as a pair. You can also be sure that your workplace culture will be maintained and your rituals observed: pairing includes a built-in guide to how standup works, when to go home, and where the best nearby sandwich shops are. On-boarding is simpler too: just have a seat at one of the [pre-configured pairing stations](https://blog.voom.flights/our-team-workstation-at-voom-continuous-improvement-a1cf35ec1a43) and get to work! The lower operational cost of a new employee helps us to take more risks: our hiring can become "*who do we want to take a chance on*" vs "*who can we already trust*". It helps remove bias from the hiring process and lets us tap into a more diverse pool of potential candidates. If you think Voom has an interesting hiring process, why not try it out?
[We have open engineering roles in our Seattle office](https://www.voom.flights/careers), and some roles are remote-friendly! --- --- url: /blog/post/2018-10-09-tips-for-building-international-products.md description: >- Voom is an Airbus subsidiary working to make booking a helicopter as easy as booking a car… while we build out the infrastructure for the… --- ![](/images/medium-export/1__kPialmqAJFyZG3FCt1JC__A.jpeg) [Voom](https://www.voom.flights) is an Airbus subsidiary working to make booking a helicopter as easy as booking a car… while we build out the infrastructure for the [next wave](https://www.airbus.com/newsroom/press-releases/en/2018/02/vahana--the-self-piloted--evtol-aircraft-from-a--by-airbus--succ.html) of [vehicles](http://www.airbus.com/newsroom/press-releases/en/2017/10/cityairbus-demonstrator-passes-major-propulsion-testing-mileston.html) which will fly over our cities. Voom is live in both São Paulo and Mexico City, with more cities coming online soon. As a product and engineering team based in the US, we have been focused on building a product for international markets from day one. We’ve spent a lot of time thinking about the best way to build what Voom CEO [Uma Subramanian](https://medium.com/u/67c0bd91af95) calls a "born-global" business. Here are a few of the principles we’ve adopted along the way. ![](/images/medium-export/1__TCItymB9p5RGrh6SCGL7tQ.png) #### Hire Locals and Empower them ![](/images/medium-export/1__06TUsveOdPpB__OiS1EyLKw.jpeg) This point cannot be repeated often enough: *Hire a team in-market and structure your company in such a way that each team is empowered to act locally.* "Empower" to Voom means creating a structure in which each country is responsible for its own success, and then equipping them with a budget and team that can make that happen. There is so much regional knowledge required to build a B2C business — from knowing the local language, traditions, and customs as well as the local policies and regulations, to being aware of weather conditions in real time, to recognizing that the country effectively shuts down during its World Cup games ⚽️. Rather than enumerate all the reasons for structuring your company this way (proper brand voice, local knowledge of culture, events, and restrictions, proper time zone and holiday support, regulatory know-how, linguistic help…), I thought it might be powerful to share some anecdotes about things our US-based Product Team would have certainly gotten *wrong* if we didn’t work closely with our Brazilian and Mexican colleagues: * Did you know there is no word in Portuguese for "commute"? Imagine if Voom’s marketing focused on having an easier time getting to work… * In Brazil, sharing your tax ID (CPF Number) is nothing like sharing a US Social Security Number. It’s common, and it’s odd if you *don’t* ask for it. * The word "Rider" (what we call Voom customers) translates literally to "cowboy" in Brazil. * You can’t charge baggage fees in Mexico. It’s against the law! * Brazilian Valentine’s Day, "[Dia dos Namorados](https://www.thesun.co.uk/fabulous/6510262/dia-dos-namorados-2018-brazils-lovers-day-celebrated/)", is celebrated in June, not February. Offering romantic scenic flights in February would have been quite odd. * When sending a text message to Mexico from outside the country, there’s an extra "1" you need to dial. … and there are countless more. Another thing that proved to be important is to have a consistent local voice when it comes to translation.
For us, that is our local Marketing Manager. We want to be sure that similar nuance and idioms are used throughout our product and marketing. To make this process simple, we are investing in tools that allow the product team to share a translation to-do list, the English versions, and screenshots of where each phrase will appear in the products. #### Don’t lock Country to Language or Currency Imagine you are a Mexican traveler, at home in Mexico City, planning a trip to São Paulo for work. What language should you see on the Voom website… Portuguese or Spanish? What currency should you see prices in? What timezone should your boarding time options be displayed in? Initially, when we launched Voom, we made the choice that we should focus the Rider’s experience on "Voom’s closest active region to where they currently are." However, we learned that over 1/3 of Voom’s customers are international travelers, so that quickly proved to be the wrong approach. Our Mexican traveler would have seen helipads in her city, rather than in São Paulo, and her flight times would be in the wrong timezone! **Below are the product principles we use now to address the aforementioned issue:** * The language of the customer **experience is set based on the user’s device preference** (operating system language and browser) regardless of the country we think they are in or traveling to. * Our customers are international travelers. We don’t know enough information to guess where they want to go, **so we should ask them explicitly.** * Once a Rider selects a region for their flight, everything (other than language) should be **oriented to the region**: including time zones, prices, and currency. #### Political Correctness is Regional ![](/images/medium-export/1__ZyCXgINjVPTrxiBj6oUWeQ.jpeg) When you fly on a helicopter, many countries have regulations requiring that we capture some information about you, in addition to checking your photo ID at the gate. These include weight, age, and gender. In the United States, when collecting gender, offering a third non-binary option (or better yet, a [free-text](https://www.hrc.org/resources/collecting-transgender-inclusive-gender-data-in-workplace-and-other-surveys) field) is best practice. We conducted research to see how our customers in Brazil and Mexico would react, and the result was surprising to me — someone based on the US West Coast. Even for those Latin American people who may identify with a non-binary gender in private, many would not choose such an option on any public form. We were even told that offering the option may be offensive to some of our customers and have a negative impact on our business. With that in mind, but still wanting to offer an inclusive third option, we chose "prefer not to state," and worked with our legal team to ensure that this would be an acceptable option. Again, be sure that your local teams can weigh in on your product choices. #### Moving Money is Hard, and has Product implications ![](/images/medium-export/1__PTaEY__0nSRVYrS4ZwjuPHQ.jpeg) Some countries, like the US, Canada, and Mexico, have a fairly easy flow of currency between them (with taxes…), and you can run your international business with a single bank account and related payment vendors. However, there are other countries with strict currency controls, which make it incredibly hard to move money in or out of the country. These countries include Brazil and China. Voom uses [Stripe](https://stripe.com/) to process credit cards in our regions.
However, because of the currency controls present in some of the countries in which we operate, we couldn’t use a simple Stripe configuration. We needed a bank account, accounting, and Stripe account in every country Voom operates in. There are implications for this in our product: * If you were to save a credit card on your Voom account in Brazil, you would need to save it again when you fly in Mexico. It wouldn’t be saved to your "Voom Account" in general, since it is authorized with a separate merchant account in each country. * If we create a coupon or promotion, it has to be valid in only one country and currency. * If you are flying with Voom credit or a Company Account, which countries can you fly in, and in which country do we apply tax? The answer depends on which country you are based in. When building your international product, the legal and tax landscape is more complex than you might initially guess, so be sure to leave the time to build the nuanced tools you will need! #### Establish a "Corporate" Timezone and Language, and ensure that everyone knows how to use it ![](/images/medium-export/1__aVzJTznRRfP1lM7AXe9yLw.jpeg) When your company looks at metrics and establishes KPIs, when does the month start and end? When your company drafts internal policy documents (like an Employee Handbook), which language do you use? These questions may seem trivial now, but as you hire more employees in more countries, you’ll need to be sure that everyone can communicate. You will need each employee to have a baseline competency in your company’s chosen language. This is a hiring requirement. When setting up reporting tools, you will either need to lock in a reporting time zone at the beginning (for example, Google Analytics requires this), or you will need to be sure that your tools allow flexible time zone reporting (we use [Looker](https://looker.com/) at Voom and this is something the tool does well). We chose [UTC](https://en.wikipedia.org/wiki/Coordinated_Universal_Time) for our reporting timezone and English as our internal language. Coincidentally, English is also the "official" internal language of Airbus at-large. We are always learning about the best ways to build a small international company. [If this type of work is interesting to you, we are hiring](https://www.voom.flights/careers)! --- --- url: /blog/post/2016-10-11-too-many-chrome-tabs.md description: 'Here’s a fun bug report:' --- Here’s a fun bug report: ![](/images/medium-export/1__QX3Wun3NN__374jHq__VsngA.png) After talking further with @bluesunrise, it turns out that this error only appeared on his development machine, and even more specifically, only when the Chrome browser was open! This is a problem with the [node-resque](https://github.com/taskrabbit/node-resque) project, which, among other things, is used by [ActionHero](http://www.actionherojs.com/) to enqueue and work background tasks. One of the things Resque does on boot is ask Redis (the backing store for this data) which workers it thinks are running on this host. We do this to check to see if any old workers have crashed while working on a job… and if they have, we: * Move the job they were working on into an "error" list for introspection * Remove the old crashed worker process from Redis To check which workers this host can manage, all workers have the system’s "hostname" saved, and we can look at what PID they were running as.
If that PID no longer exists on this host, we can assume the worker has crashed and clean up the data as described above. This means our Node.JS process needs to check on all the running PIDs on the system. Here’s how we **used** to do it (simplified):

```js
var exec = require('child_process').exec;

worker.prototype.getPids = function(callback){
  var cmd = 'ps awx';
  var child = exec(cmd, function(error, stdout, stderr){
    var pids = [];
    stdout.split('\n').forEach(function(line){
      line = line.trim();
      if(line.length > 0){
        var pid = parseInt(line.split(' ')[0]);
        if(!isNaN(pid)){ pids.push(pid); }
      }
    });
    if(!error && stderr){ error = stderr; }
    callback(error, pids);
  });
};
```

Check out **ps awx**. We are asking the OS for the whole process list, and then extracting all the PIDs… which does accomplish our goal. To compare, check out how Ruby’s Resque does the same job:

```ruby
def linux_worker_pids
  `ps -A -o pid,command | grep -E "[r]esque:work|[r]esque:\sStarting|[r]esque-[0-9]" | grep -v "resque-web"`.split("\n").map do |line|
    line.split(' ')[0]
  end
end
```

Ruby has the luxury of *knowing* that the name of the process running this application will be called "Resque". However, for the Node.JS version, it might be called "node", but it also might be called "electron", or "iojs". Since we can’t be sure of the process name, this means we need to look at *all* processes. When you look at *all* processes on a system, there might be a lot of them… I learned that @bluesunrise had a \*lot\* of tabs open in Chrome. Every tab counts as a process. Also, the process list contains a lot of data: the PID, the name, the path, etc. After about 10,000 characters, Node.JS’ Buffers start to get full, and apparently in some cases, crash. So now that we know the source of the problem, how do we fix it? Since Node.JS’s parsing of the large string returned by the sub-process was the problem, can we off-load this work? We can!

```js
var cmd = "ps -ef | awk '{print $2}'";
```

Here, rather than load in *all* the data from **ps**, we are using AWK to return only the process IDs. AWK is a safe choice because it is a standard utility available on all Unix/Linux/OSX distributions. This returns a far shorter string back to Node.JS to parse. Hooray! --- --- url: /blog/post/2012-02-07-give-travisci-money.md description: >- Travis CI is raising money to keep their free community continuous integration server up and running. This is an awesome service which I use… --- ![](/images/medium-export/1__u4zG6GFiZMoQd__LfW23W0g.png) [Travis CI](http://travis-ci.org/) is raising money to keep their free community continuous integration server up and running. This is an awesome service which I use to test my node projects. > *I’m lazy. Without Travis, I wouldn’t test. Travis makes it easy to test. Testing is good. Q.E.D. Travis is Good / Throws money at screen* #### [Help them out here >>](https://love.travis-ci.org/) --- --- url: /blog/post/2022-01-06-typescript-types-from-class-properties.md description: Use TypeScript to compute types from complex Objects --- ![TS Logo](/images/posts/2022-01-06-typescript-types-from-class-properties/220106-ts-types.png) At Grouparoo, we use a lot of [TypeScript](https://www.typescriptlang.org). We are always striving to enhance our usage of strong TypeScript types to make better software, and to make it easier to develop Grouparoo. Strong types make it easy for team members to get quick validation about new code, and see hints and tips in their IDEs - a double win!
Recently, I found myself repeating a lot of metadata when defining a new API endpoint as I was working to enable [`noImplicitAny`](https://www.typescriptlang.org/tsconfig#noImplicitAny) within the `@grouparoo/core` project. We use [Actionhero](https://www.actionhero.com) to build Grouparoo, and so a typical Action might look like:

```ts
import { Action } from "actionhero";

export class TestAction extends Action {
  constructor() {
    super();
    this.name = "testAction";
    this.description = "I am a test";
    this.inputs = {
      key: {
        required: true,
        formatter: stringFormatter,
        validator: stringValidator,
      },
      value: {
        required: true,
        formatter: integerFormatter,
      },
    };
  }

  // <--- Note the type definition below for `params`
  async run({ params }: { params: { key: string; value: string } }) {
    return { key: params.key, value: params.value };
  }
}

function stringFormatter(s: unknown) {
  return String(s);
}

function integerFormatter(s: unknown) {
  return parseInt(String(s));
}

function stringValidator(s: string) {
  if (s.length < 3) {
    throw new Error("inputs should be at least 3 letters long");
  }
}
```

Notice how the params provided back to the `run()` method are typed, even though we provide that information functionally via the `formatter` argument to the Action's inputs. Defining this information in both locations was tedious, and more nefariously, a possible place for drift between the implementation and the types. What would it take for TypeScript to automatically be able to determine the types of our Params? ## Learning Things I tried many approaches to programmatically determine the types of an Action's params, and learned a lot along the way. The most interesting thing that I learned was that method argument types are not inherited in TypeScript. Initially, I wanted to modify the abstract base `Action` class to automatically reflect its input types into the run method, but it's not possible. Consider the following:

```ts
abstract class Greeter {
  abstract greet(who: string, message: string): void;
}

class ClassyGreeter extends Greeter {
  greet(who, message) {
    console.log(`Salutations, ${who}. ${message}`);
  }
}

const classyGreeterInstance = new ClassyGreeter();
classyGreeterInstance.greet("Mr Bingley", "Is it not a fine day?"); // OK, inputs are strings
classyGreeterInstance.greet(1234, false); // Should throw... but it doesn't!
```

Even though `ClassyGreeter` extends `Greeter`, the fact that the `greet()` method is re-implemented means that the initial type of the method from the abstract class can't be assumed. After hitting that dead end, I pivoted to attempt to build a transformation utility type. While working on this, I found myself inspecting the properties of the `Action` class in question, and I learned that TypeScript doesn't *really* know what goes on in a Class constructor. For example, you can define the same class both ways:

```ts
class ConstructedList {
  items: string[];
  constructor() {
    this.items = ["apple", "banana"];
  }
}

class StaticList {
  items = ["apple", "banana"] as const;
}

new ConstructedList().items; // string[]
new StaticList().items; // readonly ["apple", "banana"]
```

At runtime, these 2 classes will have the same behavior, with `this.items = ["apple", "banana"]`, but because the class property was defined directly (and narrowed with `as const`) in `StaticList`, we can get the literal types back, rather than just the `string[]` we get from `ConstructedList`.
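Once values like these live on the class as plain property initializers, TypeScript can read them back out at the type level with `InstanceType`, indexed access types, and `keyof` - the same building blocks the utility type in the next section relies on. Here is a small, self-contained sketch; the `StaticList`, `Items`, and `HasInputs` names are just for illustration and are not part of Actionhero or Grouparoo:

```ts
class StaticList {
  // A plain class property initializer - no constructor involved.
  items = ["apple", "banana"] as const;
}

// InstanceType<typeof StaticList> is the type of an instance of the class,
// and ["items"] is an indexed access into that type.
type Items = InstanceType<typeof StaticList>["items"];
// => readonly ["apple", "banana"]

// Individual elements keep their literal types too:
type FirstFruit = Items[0]; // "apple"

// keyof can enumerate the keys of a class property, which is what lets a
// utility type iterate over something shaped like an Action's `inputs`.
class HasInputs {
  inputs = { key: {}, value: {} };
}
type InputNames = keyof InstanceType<typeof HasInputs>["inputs"]; // "key" | "value"
```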
![TypeHints](/images/posts/2022-01-06-typescript-types-from-class-properties/hints.png) ## The `ParamsFrom` Type Utility Knowing the above, it became clear that to reach the goal, I would need to reformat all of our action definitions to *not* use a constructor. After that, TypeScript can start to inspect the properties of the class. Our utility can take in the Action's class as an argument, and inspect both the keys of the `inputs`, and if there is a `formatter` present, infer its return type:

```ts
export type ParamsFrom<A extends Action> = {
  [Input in keyof A["inputs"]]: A["inputs"][Input]["formatter"] extends (
    ...args: any[]
  ) => any
    ? ReturnType<A["inputs"][Input]["formatter"]>
    : string;
};
```

Of note, because we are accepting data over HTTP or websocket most commonly, we can assume that an input without a formatter is a string. Putting everything together, here's what our final Action looks like:

```ts
import { Action, ParamsFrom } from "actionhero";

export class TestAction extends Action {
  name = "testAction";
  description = "I am a test";
  inputs = {
    key: {
      required: true,
      formatter: stringFormatter,
      validator: stringValidator,
    },
    value: {
      required: true,
      formatter: integerFormatter,
    },
  };

  async run({ params }: { params: ParamsFrom<TestAction> }) {
    return { key: params.key, value: params.value };
  }
}

function stringFormatter(s: unknown) {
  return String(s);
}

function integerFormatter(s: unknown) {
  return parseInt(String(s));
}

function stringValidator(s: string) {
  if (s.length < 3) {
    throw new Error("inputs should be at least 3 letters long");
  }
}
```

And finally, we can see our params are typed: ![TypeHints](/images/posts/2022-01-06-typescript-types-from-class-properties/final-types.png) ## Open Source Contribution We contributed this work back to Actionhero, and in [Actionhero v28.1.0](https://github.com/actionhero/actionhero/releases/tag/v28.1.0), the `ParamsFrom` utility is included! --- --- url: /blog/post/2012-01-27-unicorns.md description: >- At ModCloth, our website uses the Ruby on Rails framework, and we use the unicorn web server. When deploying code via Capistrano, we have a… --- ![](/images/medium-export/1__UQMJrkiUtHbA2W__e3__HSYg.jpeg) At [ModCloth](http://www.modcloth.com/), our website uses the [Ruby on Rails](http://rubyonrails.org/) framework, and we use the [unicorn web server](http://unicorn.bogomips.org/). When deploying code via [Capistrano](http://capify.org), we have a funny line of code which turns off our old web-servers and starts up new ones. Every time we deploy, we must kill a Unicorn. This inspired a bit of artwork. ![](/images/medium-export/1__tOIT5iVAI0So8ruY5Z0xYg.jpeg) Thanks [Ali Spagnola](http://alispagnola.com/) and her [Free Painting project](http://www.alispagnola.com/Free/)! --- --- url: /blog/post/2012-02-01-unicorns-in-node.md description: >- At work the other day, an engineer most familiar with Ruby asked "What is the equivalent to Unicorn in Node?". Unicorn is a great… --- At [work](http://www.taskrabbit.com) the other day, an engineer most familiar with Ruby asked "What is the equivalent to [Unicorn](http://unicorn.bogomips.org/) in Node?". Unicorn is a great single-threaded server for ruby apps (Sinatra, Rails, etc) which implements a parent-child cluster of workers to share requests. However, as node can handle more than one request in parallel, the metaphor gets a little strange. ![](/images/misc/unicorndies.jpg) This question becomes important when deploying a production app: it’s most cost-effective to use 100% of your resources.
That means in both Node and Unicorn, you want a "child" for each CPU you have, assuming you don’t run out of RAM. However, you might want even more if your application spends significant time waiting for another server (perhaps a DB). For example, say you have a website which spends 1/2 of its time doing "CPU-bound tasks" (like rendering HTML) and the other 1/2 of its time fetching information from the database. If you have a 4-core server, you would want ~8 workers to take maximum advantage of the processors you have (1). In Ruby/Rails, this would mean that at most you could handle 8 simultaneous requests, but what about node? Once again, the question gets a little strange. We know that our ruby app is single-threaded and single request-ed. This means that no matter which ORM or framework we use, only one request can be happening at a time (2). In an equivalent application, Node will still take 1/2 the time per request to process those HTML templates, but we have the opportunity to *stack* requests when the CPU is idle waiting for the database. We can have any number of requests "pending" while waiting for the database (3). This, theoretically, can greatly increase your application’s throughput. Unicorn, when running, handles a request like this: port -> master -> child. The request is always received on the same port or socket, and the master process routes the request to a child. The master process can spawn or kill children, reboot them, and otherwise manage them. We can make something like this easily in node with the [cluster module](http://nodejs.org/api/cluster.html); however, only *some types of applications* would benefit from a cluster in practice, mainly those that are CPU bound (or that rely on an external service which is CPU bound). Take the simple example webserver from the nodejs.org website:

```js
var http = require("http");
http
  .createServer(function (req, res) {
    res.writeHead(200, { "Content-Type": "text/plain" });
    res.end("Hello World\n");
  })
  .listen(1337, "127.0.0.1");
console.log("Server running at http://127.0.0.1:1337/");
```

This server isn’t doing much, and can probably handle thousands of connections at a time. Let’s "simulate" a slower request that takes 1 second:

```js
var http = require("http");

var handleRequest = function (req, res) {
  setTimeout(function () {
    res.writeHead(200, { "Content-Type": "text/plain" });
    res.end("Hello World\n");
  }, 1000);
};

http
  .createServer(function (req, res) {
    handleRequest(req, res);
  })
  .listen(1337, "127.0.0.1");
console.log("Server running at http://127.0.0.1:1337/");
```

Now, even though a client will see the response after 1 second, the server still isn’t doing much. Requests are collected and stored, so more RAM will be used, but there isn’t any real computation happening. I’ll bet this server can still handle thousands of simultaneous requests. You can test this out. Make 10 requests with curl and time them (`time curl localhost:1337`) as fast as you can, and you will notice that all of them only take 1 second to complete. What we need to do now is to simulate a "blocking" sleep, which means that we need to engage the CPU the node process is using and block it.
Keep in mind this is a terrible idea and should never be done in real life:

```js
var http = require("http");

var handleRequest = function (req, res) {
  var startTime = new Date().getTime();
  var sleepDuration = 1000;
  while (startTime + sleepDuration > new Date().getTime()) {}
  res.writeHead(200, { "Content-Type": "text/plain" });
  res.end("Hello World\n");
};

http
  .createServer(function (req, res) {
    handleRequest(req, res);
  })
  .listen(1337, "127.0.0.1");
console.log("Server running at http://127.0.0.1:1337/");
```

Note how we use a loop which doesn’t exit until enough time has passed to "block" the CPU. Now we know for a fact that our application will only handle one request per second (and use a whole CPU core to do it). If you now make 10 requests with curl and time them, you will notice that the requests stack and take longer. The first request will take 1 second as before, but if you start the requests all at the same time, the second request will take 2 seconds, the third request 3, etc… Now we have an application which will benefit from cluster! If we can launch 10 parallel instances of this server at once, we can go back to handling 10 requests in 1 second (4).

```js
var http = require("http");
var cluster = require("cluster");
var desiredWorkers = 10;

var log = function (message) {
  console.log("[" + process.pid + "] " + message);
};

var handleRequest = function (req, res) {
  var startTime = new Date().getTime();
  var sleepDuration = 1000;
  while (startTime + sleepDuration > new Date().getTime()) {}
  res.writeHead(200, { "Content-Type": "text/plain" });
  res.end("Hello World\n");
  log("sent message in " + (new Date().getTime() - startTime) + " ms");
};

var masterSetup = function () {
  for (var i = 0; i < desiredWorkers; i++) {
    cluster.fork();
  }
  log("cluster booted!");
};

var childSetup = function () {
  http
    .createServer(function (req, res) {
      handleRequest(req, res);
    })
    .listen(1337, "127.0.0.1");
  log("Server running at http://127.0.0.1:1337/");
};

if (cluster.isMaster) {
  masterSetup();
} else {
  childSetup();
}
```

Note how we added a logger which shows the pid of the process saying the message, and we can show that there are now 11 distinct processes running: 10 children and a parent. As a bonus, you can look at how to instrument a cluster implementation like this with signals, so you can tell the master process to add or remove children, reboot them, etc., by looking at the [actionhero cluster module](https://github.com/evantahler/actionhero/blob/master/bin/methods/startCluster.js). There are always limits to how many requests an application can handle at a time, even with the simplest example here. However, the benefits of parallelism increase substantially with CPU-bound workloads. Footnotes: 1. In reality you probably want 7 workers on a 4-core system, to leave some flex space 2. Sinatra and Rails follow this; EventMachine is the exception in Ruby frameworks 3. It’s very possible to overload a database with too many simultaneous requests, like any server. This is why most database adapters (and some ORMs) operate a connection pool which will limit the number of requests it will make at a time. Subsequent requests are queued in-app. 4. This really only works if you have a computer with 10+ CPUs, but you get the idea. --- --- url: /blog/post/2023-04-28-upgrading-community-prs.md description: Getting the Airbyte Community involved --- ![Connector release stages](/images/posts/2023-04-28-upgrading-community-prs/image.jpg) Hello Airbyte Community!
I am @evantahler, an Engineering Manager at Airbyte and part of the Connectors Organization. We wanted to share an update with you about how we are thinking about Community Contributions, your [Pull Requests](https://github.com/airbytehq/airbyte/pulls?q=is%3Aopen+is%3Apr+label%3Acommunity), in 2023. First, we know that we have not been as responsive as we would like to be - you are creating PRs faster than we can review and merge them! This speaks volumes about the strength of the Airbyte community, how simple it is to enhance our existing connectors, and how easy we’ve made it to contribute new connectors. Thank you for your willingness to aid your fellow Data Engineers and make Airbyte the largest collection of data integrations! You enable us to make progress on our mission - to make data available and actionable to everyone, everywhere. Let’s take a look at why we aren’t able to review and merge your contributions with the speed we would like, and then, what we are going to do about it. ## Connector Contribution Problems we are going to solve in 2023 **Testing a PR requires manual intervention** We want all the connectors in our [repository](https://github.com/airbytehq/airbyte) to be well-tested and adhere to certain quality standards, especially as they move up the [Connector Certification Stages](https://airbyte.com/blog/connector-release-stages). This requires running integration tests against our test sandbox accounts or sample databases. Because we will be passing in real API keys, this requires a manual review to kick off the tests in our continuous integration environment. We also lack an automated way to share these same credentials with contributors so they can test things out locally. Airbyte connectors are unlike other types of software where you can mock external API calls in tests - that’s the whole point of a connector! We want to ensure that the connector still works as expected, and that the upstream API hasn’t changed either. **The requirements to publish a new version of a connector are opaque** To provide a good experience for our users, we want every connector (and new version of a connector) to have a good changelog, adhere to [semver](https://docs.airbyte.com/contributing-to-airbyte/#semantic-versioning-for-connectors) for breaking changes, and include additional metadata (like docs, icons, etc). This process is not well documented right now, and often requires an Airbyte team member to add this information or go back-and-forth with the contributor to collect it. **Our complex connectors are tightly coupled** Some of our most complex connectors (those that deal with databases and files) share a lot of code. This simplifies *our* development and maintenance of these connectors, but it creates an environment where changing one file, say for source-mysql, might also affect how source-postgres works. This makes reviewing these changes difficult and contributing in these areas extra tricky. **Airbyte supports all of our connectors for Airbyte Cloud users** Today, for users of Airbyte Cloud, we provide support if something goes wrong on every sync. As we add connectors from the community that we aren’t familiar with, and possibly don’t have access to, the burden on our support team grows. We currently don’t merge connectors that connect to an API or Database which we can’t test ourselves. ## What are we doing about it?
This year, we will be prioritizing the following projects: * **Allowing the team to focus on community contributions by creating self-serve support tools** * **Rethinking how we share secrets and run tests for contributors** * **Revamping our Connector Metadata system, and making clear requirements and tests** * **Adding CDKs (connector development kits) for our file-based sources and decoupling Java database connectors** * **Creating a way for contributors to provide “community-supported” connectors** On the first topic, you may have seen some new tools and Slack channels in the Airbyte Community Slack and Discourse ([airbyte.com/community](https://airbyte.com/community)). Our #airbyte-help channel has been divided into 5 channels, each focused on a specific topic. This enables our community to easily locate and search for previous help threads. Additionally, we have introduced Kapa.ai, a ChatGPT bot, to assist us in responding to queries from our expanding community. This will free up our team of Technical Support Engineers to get back to working with you on your code. We don’t have anything to share publicly yet on the remaining projects listed above, but stay tuned! As we roll out each of these features, we will be sure to share them with you. ## What to do in the Short Term? While we are working on making the developer experience better, which will take some time, we wanted to share guidance about how to contribute in a way that we will be most likely to accept. 1. **Focus on enhancing existing popular (Generally Available) connectors** Since we already have tests and a sandbox environment set up for our existing connectors, testing enhancements to them is the easiest thing for us to do. We are always working to speed up and enhance our most popular connectors, and since we are working on them ourselves, we will be more likely to address your contribution in a timely manner. Be sure to add tests for any new functionality you add! 2. **Create new connectors using the low-code CDK** If you find yourself making a new API Source connector, please consider using our low-code CDK. We’ve built a way for you to build API sources using only YAML! This means that there’s almost no code for us to review, which makes the review process very quick! Stay tuned for an update in the next few months about how we’ve made making low-code connectors even easier. 3. **Let us make the changes for you** Finally, please grant us permission to correct small mistakes and add metadata in your PRs. If you [“Allow Edits from Maintainers”](https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/allowing-changes-to-a-pull-request-branch-created-from-a-fork#enabling-repository-maintainer-permissions-on-existing-pull-requests) in your PRs, rather than go back-and-forth with you, we can just make the changes ourselves 😀 This doesn’t mean that we won’t review and accept PRs in other areas, but for now, this is where we will be focusing our efforts as we build the enhancements to our tools and processes listed above. **Thank you for building Airbyte with us.** --- --- url: /blog/post/2019-09-13-using-next-js-for-static-sites.md description: >- Use tools like Circle.Ci, Github, and Next.JS to host production-grade websites for free!
--- ![](/images/medium-export/1__SWbmOATNSu__AnPWW__DKeLw.jpeg) On Thursday, September 12, 2019, I gave a talk at [Seattle.js](https://twitter.com/seattlejs/) about how to use some of the free tools in the Javascript & open-source ecosystems to host static websites. This talk was inspired by a group of students learning to code in Seattle who were being taught tools like React and Angular, but struggling to learn how to deploy their sites using modern methods. Specifically, how to set up CI/CD (Continuous Integration + Continuous Deployment) and HTTPS. Through my work on [ActionHero](http://www.actionherojs.com), an open source project, we’ve come up with some best practices to accomplish all of this for free! Thanks to the free tools from Github, CircleCI and Zeit, it’s pretty easy to get production-grade infrastructure for smaller projects… for free! The repository for the talk can be found here: --- --- url: /blog/post/2012-11-24-what-to-do-when-softpedia-scrapes-your-project.md description: Today I woke up to learn that I had made it big! --- ![](/images/medium-export/1__h2gRKouxzuquwx2gPUb6zA.jpeg) #### Today I woke up to learn that I had made it big! That’s right, today I have been ‘inducted’ into the annals of internet history as [some of my open-source projects have been crawled into Softpedia](http://webscripts.softpedia.com/developer/Evan-Tahler-1869788142.html). For those of you who don’t know, [Softpedia](http://www.softpedia.com/) is a website which collects links to various software programs, categorizes them, and offers links to downloads. The "Free Downloads Encyclopedia", they call themselves. I know that at some time in my past I must have used Softpedia to find an obscure video codec for VLC or a "free file converter" for that totally legitimate torrent that I downloaded, but since I have moved to OSX (about 6 years ago) I hadn’t used sites like Softpedia very often. A rush of nostalgia washed over me. However, after clicking around some of the project pages, I began to get frustrated. First off, I couldn’t seem to actually download any of the projects. Every download button I clicked took me to another website to download a windows-only "download manager" (Google it if you are curious, I don’t want to give them any more traffic). After a few failed download attempts, I finally realized that Softpedia was actually doing the noble thing and linking visitors to GitHub’s zip of the master branch (albeit via 3 download confirmation pages), but every ad on the page (and there are lots of them) was for a ‘misleading’ download link. I struggled with what to do next. In the spirit of open source ([and my overly-lenient license](https://raw.github.com/evantahler/actionHero/master/license.txt)) I didn’t want to stop Softpedia from referencing my projects. Hey, any press is good press, right? They also did a good job of categorizing actionHero properly and they seem to be quite good at SEO.
I had initially planned to ask them to take down my projects back when I thought all the download links were fraudulent, but I reworded my letter to be the following: > *Hello -* > > *I see that you have started to reference a number of my projects on Softpedia \[\[* [*http://webscripts.softpedia.com/developer/Evan-Tahler-1869788142.html*](http://webscripts.softpedia.com/developer/Evan-Tahler-1869788142.html) *]].* > > *In the spirit of open source, rather than ask you to remove the links (which I reserve the right to do in the future), I would simply ask that you clarify that all the download links take the visitor to the appropriate github page (IE the download link for actionHero \[\[* [*http://webscripts.softpedia.com/script/Development-Tools/Frameworks/actionHero-76758.html*](http://webscripts.softpedia.com/script/Development-Tools/Frameworks/actionHero-76758.html) *]] should link here \[\[* [*https://github.com/evantahler/actionHero*](https://github.com/evantahler/actionHero) *]] ).* > > *By download links, I include all the ‘ads’ which surround the download page with ‘misleading’ downloads and download managers. I understand that advertising is likely to be a large portion of your revenue, but I only ask that you not mislead visitors who are looking to find my software. I would prefer to not be associated with malware/adware. Many of those links are also confusing as they feature ‘windows only’ download managers for software you have (properly categorized) for many operating systems including OSX, Linux, and Solaris.* > > *I have attached a screen shot of the download page indicating ‘misleading’ links in red, and the ‘real’ link in green. In this way, you can still offer landing pages for my software and don’t need to host any of the downloads. This will offer a better experience (and legitimate software) to your visitors.* ![](/images/medium-export/0__RV96A9P7Vg6yBh8X.png) I think this is reasonable. I don’t want to stop them from doing what they do best, but I really don’t want these projects to be associated with malware. What would you do? I’ll keep updating here as I hear back from them. ### Softpedia responded on November 26th > Hello, Thank you for getting in touch. > > We understand your concern, but unfortunately we are unable to guarantee that a ‘download manager’ ad will never show up on the download pages of your products. Sorry about that. > > Note that we do try and block them via our Adsense account but new ones show up. Additionally, some ads use behavioural targeting so any block on our side would be ineffective. > > We can change the target of the download link to be your Github page, let us know if that will be satisfactory. > > We would love to keep listing your software, but we will understand if you would prefer to have it removed. ### My response > Thanks for the response- > > I understand that it might be hard to filter out those types of ads, and it is good to hear that you are trying to block the ‘confusing’ ones. Please change the download links to the github page for the project (IE: [https://github.com/evantahler/actionHero](https://github.com/evantahler/actionHero)), and feel free to keep your pages live for now. --- --- url: /blog/post/2013-03-19-when-does-api-first-not-apply.md description: When should you not use an API-First methodology? --- In this post, I will explore some of the downsides of API-First development, and some of the situations where developing in an API-First way may not be appropriate.
![](/images/medium-export/1__Df3z03ju7EPaTPnzqtCavw.jpeg) ### I’m only making a website. I don’t need this. The main "catch" of developing in an API-First way is that your first release will take longer. There is no way around it. There is additional overhead of creating 2 applications (your front end application and your backend API). There is also additional cognitive overhead for everyone on the team to think not only about their piece of the puzzle (engineering, design, project management, product management), but also how to express their work in terms of the agreed-upon API. Doing all of this will slow your team down… at first. **If you believe that your project will ONLY ever have one expression, then perhaps API first is not for you**. However, there are caveats: * Even if your project will only ever have one front-end expression (website, iPhone) right now, you can never be sure what the future will hold. Building from an API to start is a valuable investment in your future. * Even if your project will only ever have one front-end expression, there are benefits to being able to work on your "server" separate from your "views". Do you want to experiment with different UIs without worrying about risk to your servers and data? Do you want your engineers and designers to be able to develop in parallel without relying on each other? You still may want to develop API-first. * There are operational benefits to developing API first, mainly that you can scale your infrastructure using only the parts you need. You can scale/distribute your front end without affecting your API servers. Currently, this takes the form of offloading all of your assets to something like Amazon s3 or Github Pages, and reducing the load on your servers, allowing you to do more with less. There are many applications where API-First doesn’t apply. For example, if you are building a video game with no online component just for the iPhone, certainly don’t waste time with externally facing APIs. But, if you think you might also port your game to Android one day, it may make sense to extract and modularize as much as you can. ### I’m a very small team. I can’t use this If your team is very small (1 developer, 1 designer), API-First Development may be too much overhead. This is a very valid concern! However, I have found that when I find myself in a group this small, we actually end up doing API-First development anyway! We probably didn’t agree on a formal API document beforehand, and we probably never had an inception, but we are doing it nonetheless… it’s simply a logical way to separate our work! If we had taken the time to have an inception and flesh out the API ahead of time, often we would have saved ourselves some trouble when it came time to integrate. Oftentimes, even though the features match, the variable names or language don’t. Rework is required to fix it (see the sketch at the end of this post for what such an agreed-upon contract might look like). ### I’m a design shop, and I don’t need to support this You might be surprised to learn that you are already using an API. Are you developing a WordPress theme? You are the front-end consumer of WordPress’ post and data APIs. Are you using a PaaS to host your site (like Heroku or AppFog)? If you are, you are using their storage and server APIs. They made their API with some assumptions of what their customers would do, and how they might choose to implement sites on their platform. This isn’t a direct use of API-First, as you are buying a "complete" product from them, but you are bound by the rules of their API.
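To make the earlier point about agreeing on the API ahead of time concrete, here is a minimal, entirely hypothetical sketch of the kind of shared contract a front-end team and an API team could both code against. The endpoint, names, and fields are invented for illustration and are not from any real project:

```ts
// A hypothetical contract agreed on at the inception, before either team starts building.
export interface CreateOrderRequest {
  customerId: string;
  items: { sku: string; quantity: number }[];
}

export interface CreateOrderResponse {
  orderId: string;
  status: "pending" | "confirmed";
  totalCents: number; // agreed: integer cents, not floating-point dollars
}

// The front-end team can develop against a stub that honors the contract...
export async function createOrderStub(
  req: CreateOrderRequest
): Promise<CreateOrderResponse> {
  return {
    orderId: "stub-order-1",
    status: "pending",
    totalCents: req.items.reduce((sum, item) => sum + item.quantity * 100, 0),
  };
}

// ...while the API team implements the real endpoint against the same types,
// so variable names and response shapes can't drift between the two halves.
```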
---

---
url: /blog/post/2021-05-14-varchar-191.md
---

![A Database symbol over a library](/images/posts/2021-05-14-varchar-191/210515-varchar191.png)

Sometimes, when you are looking at a database’s schema, you see that there are text fields defined like this:

```sql
email_address varchar(191) NOT NULL
```

This means that the column supports strings with a maximum length of 191 characters, and can’t be `null`. 191 is such an odd number - where did it come from? In this post, we’ll look at the historical reasons for the 191-character limit as a default in most relational databases.

## Why `varchar` and not `text`?

The first question you might ask is why limit the length of the strings you can store in a database at all? All modern popular relational databases support (almost) unlimited-size strings with a `text` or `blob`-type column, so why not use that?

The reason is **indexes**. If you are going to search by a column, say `email_address`, you probably want to add an index to it to speed things up when you do the following:

```sql
select id from users where email_address = 'foo@example.com';
```

As your table gets bigger, searches get slower because your database has to check every row to find a match. However, if you add a **search index**, you are telling your database to essentially "pre-compute" popular search patterns with a tree so the next search is much faster. In essence, indexes spend computation time (and a little bit of disk space) making writes to the database slower in order to speed up reads later. For most applications this is a great tradeoff, since they are "read heavy" and "write light".

So, why use `varchar`? Indexes can be made to perform better when assumptions can be made about the type of data they store, and knowing how long the strings in the index are is one of the best ways to speed things up. For some databases, you aren’t allowed to add a search index to columns of type `text` because this optimization can't be done, while in others, the index just won’t perform as well. In fact, historically, databases were constructed with limits on how big an index could be, in order to optimize search and how data was stored on disk.

## It’s MySQL’s fault

Ok, so indexes are good. But, generically, it seems that an index of *any* size should work, and while that’s true today, it wasn't always possible. The next stop on our journey is to look at what the default column size was far in the past, and that was 255 characters, e.g.:

```sql
email_address varchar(255) NOT NULL
```

MySQL, the most popular open source database of the early 2000s, had a limit of 255 characters in indexed fields. The history is fuzzy as to why MySQL chose a 255-character limit (see the articles linked below), but the most popular theories include:

* 255 is the largest number you can represent with an 8-bit integer, so a string's length fits in a single byte. MySQL, being very concerned with speed and memory usage, wanted to store things with the smallest possible data types.
* MySQL was itself trying to be compatible with even older databases (Sybase/SAP), and they had a 255-character limit.
* MySQL wanted to ensure that its index files could fit within a single page block on older file systems.

With a 255-character limit in mind, the MySQL developers felt comfortable further optimizing many parts of the database against that limit (more on this later).
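To make that era concrete, here is a sketch of what a table built on those defaults looked like, together with the kind of search index discussed above. The table and column names are illustrative, and it assumes MySQL with the then-default `latin1` character set:

```sql
-- The "classic" shape: a 255-character string column, plus a search index
-- on the column we expect to query by (illustrative names only)
CREATE TABLE users (
  id            INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  email_address VARCHAR(255) NOT NULL
) DEFAULT CHARSET=latin1;

-- The index that makes `select id from users where email_address = ...` fast
CREATE INDEX idx_users_email_address ON users (email_address);
```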
Many popular open source application frameworks launched in that time period (WordPress, Django, and Rails, to name a few), and they all followed MySQL’s defaults, even when they could run on multiple database types, like Postgres. This made `varchar(255)` a common default for most ORMs ([Object–relational mapping - Wikipedia](https://en.wikipedia.org/wiki/Object%E2%80%93relational_mapping)), regardless of the database in use.

## It’s 🐟’s fault

255 makes a lot more sense than 191. How did we get to 191? I’m going to blame emoji 😜. Seriously. Well, `utf8mb4` at least, the character set that allows for "international" \[^1] characters and included the first emoji.

MySQL in the early 2000s was happy supporting 255 characters in `varchar` columns and indexing them. However, the most popular text encodings (`latin1` or `utf8`) on the most popular MySQL database engine (`innodb`) assumed that 3 bytes were enough to store every character \[^2]. Once `utf8mb4` came along with characters like 𠼭\[^3] and 🐟, 4 bytes were needed to store each character. There were [more characters to choose from](https://www.fileformat.info/info/charset/UTF-8/list.htm), so referencing them took more bytes.

The way `innodb` MySQL databases worked, you could only have 767 bytes for an index - enough to store 255 3-byte characters (`767/3 = 255`). This is an extreme example of index optimization based on knowing the size of the data you are indexing! So if the characters took more space to store, then the number of characters you could index had to get smaller. Specifically, `767/4 = 191` characters!

As more software supported an international audience, `varchar(191)` replaced `varchar(255)` as the default. Even applications that didn't initially need to support international users eventually had to upgrade, once their users started expecting emoji support (often linked to the rise of smartphones) in the early 2010s.
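To make the arithmetic concrete, here is a sketch (again with illustrative names) of what that limit looked like in practice on an older MySQL/InnoDB server with the 767-byte key limit. The exact error text varies by version, but it looked roughly like this:

```sql
-- A utf8mb4 table on an older MySQL/InnoDB server (767-byte index key limit)
CREATE TABLE users (
  id            INT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  email_address VARCHAR(255) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;

-- 255 characters * 4 bytes = 1020 bytes, which is more than 767 bytes:
CREATE INDEX idx_users_email_address ON users (email_address);
-- ERROR 1071 (42000): Specified key was too long; max key length is 767 bytes

-- 191 characters * 4 bytes = 764 bytes, which fits under the limit:
ALTER TABLE users MODIFY email_address VARCHAR(191) NOT NULL;
CREATE INDEX idx_users_email_address ON users (email_address);
```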
\[^1]: "International" is an odd way to talk about the languages that most of the world uses, specifically those using non-Latin characters. However, operating systems and databases had thoroughly English-centric origins, and the legacy of those early choices is still rippling through our code today.

\[^2]: Throughout this post, I've used the word "character" rather than "letter", and this is why - depending on your language, each character in a string might be a letter, a whole word, or even a pictogram like an emoji.

\[^3]: "To Honk" (like from a car) - [source](https://words.hk/zidin/%F0%A0%BC%AD)

## Today

These days, with modern databases, character encodings like `utf8mb4` and others which can support "all" characters are the default, and the tiny fixed-size index key is a thing of the past. However, we still have these 191-character defaults in many applications to ensure compatibility. Regardless, indexes still work best when they know the size of the strings they are comparing, so we still want to have *some* limit on our column length for speed reasons, and thanks to history and inertia, the 191 limit is still with us.

## Thank you

Thanks to all the reference articles I checked when putting together this history, specifically:

* [Wordpress using varchar(255) for index with InnoDB and utf8mb4\_unicode\_ci? - Database Administrators Stack Exchange](https://dba.stackexchange.com/questions/141149/wordpress-using-varchar255-for-index-with-innodb-and-utf8mb4-unicode-ci)
* [Why do some fields have a varchar precision of 191 for modUserProfile in the modx schema? - MODX Community](https://community.modx.com/t/why-do-some-fields-have-a-varchar-precision-of-191-for-moduserprofile-in-the-modx-schema/940/4)
* [MySQL four byte chinese characters support - Stack Overflow](https://stackoverflow.com/questions/17680237/mysql-four-byte-chinese-characters-support)
* [Varchar fields on MySQL 5.7 – gabi.dev](https://gabi.dev/2016/09/08/varchar-fields-on-mysql-5-7/)
* [Why are InnoDB’s index keys limited to 767 bytes? - Database Administrators Stack Exchange](https://dba.stackexchange.com/questions/57005/why-are-innodbs-index-keys-limited-to-767-bytes)
* [Is there a good reason I see VARCHAR(255) used so often (as opposed to another length)? - Stack Overflow](https://stackoverflow.com/questions/1217466/is-there-a-good-reason-i-see-varchar255-used-so-often-as-opposed-to-another-l)

There's also a great discussion of this post on Hacker News - check it out [here](https://news.ycombinator.com/item?id=27186385).

---

---
url: /blog/post/2017-02-28-why-choose-actionhero.md
description: 'or: Actionhero is the Node.js server for when your project grows up'
---

*or: Actionhero is the Node.js server for when your project grows up*

![](/images/medium-export/1__FWU1OZyieAZc____WqiNXGrQ.png)

It’s been over 5 years since I started on [Actionhero](https://www.actionherojs.com/), a [Node.js](https://medium.com/u/96cd9a1fb56) server, and I’m very proud of how far we’ve come. We’ve got over 1,500 stars on [GitHub](https://github.com/evantahler/actionhero) and an active [Slack community](https://slack.actionherojs.com/). We are used by [many large companies](https://www.actionherojs.com) in production, and are often cited by many publications as one of the [better](https://firebearstudio.com/blog/top-node-js-rest-api-frameworks.html) Node.js frameworks. We are even approved by the US Department of Veterans Affairs [for use in critical health-care systems](https://www.va.gov/TRM/ToolPage.asp?tid=9029\&tab=2).

Until now, I’ve taken a very soft stance on "why" Actionhero might be better than any other server framework/tool for your project, as every project is different. Maybe all of your project’s goals really would be met using only Express, and all you need is a JSON-speaking REST API server. That said, every project I’ve worked on has always needed… *more*. Along with the other Actionhero core contributors, I’ve decided to publish this list of why you might choose to use Actionhero in a mature, enterprise environment.

#### Actionhero understands that modern applications speak more than HTTP.

Of course Actionhero features a robust RESTful router and HTTP server. But that is not enough anymore, is it? You probably also want websocket support, right? You also want to share sessions across HTTP and WS connections, and you want to be able to reuse your code across both. The Actions in Actionhero are agnostic of the communications protocol your clients are speaking, and you can reuse them. Support for all of this is included, right out of the box.

#### Actionhero can coordinate with its peers when deployed.

Actionhero is "Cluster Aware". This means that Actionhero is built from the ground up to run in parallel across multiple machines at once. Nodes can speak to each other both passively (via a shared cache and job queues; included) and actively with direct RPC communication.

#### Actionhero knows that background tasks are always required.

Sending your client a "welcome" email doesn’t belong in your web thread.
Neither does any other work that can run in the background. Actionhero treats background jobs as first-class citizens, and any part of your API can enqueue them. Actionhero runs job workers in the proper [Node.js](https://medium.com/u/96cd9a1fb56) way: event-based and many at a time.

#### Actionhero provides all the help you need, then gets out of the way.

Actionhero proposes a standard project layout, including testing and initializers. Actionhero has a REPL and generators to get you up and running quickly. Actionhero supports localization and test-driven development. After that… you can do anything you want!

#### Mature Operations for a Mature Business.

Actionhero supports zero-downtime deployments, process signaling, and more. You can be sure that your DevOps team will find running Actionhero pleasant and clear.