<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en_GB"><generator uri="https://jekyllrb.com/" version="3.9.3">Jekyll</generator><link href="https://tlvince.com/feed.xml" rel="self" type="application/atom+xml" /><link href="https://tlvince.com/" rel="alternate" type="text/html" hreflang="en_GB" /><updated>2023-12-29T20:21:29+00:00</updated><id>https://tlvince.com/feed.xml</id><title type="html">Tom Vincent</title><subtitle>Contract Full Stack Developer</subtitle><author><name>Tom Vincent</name></author><entry><title type="html">Prometheus backfilling</title><link href="https://tlvince.com/prometheus-backfilling" rel="alternate" type="text/html" title="Prometheus backfilling" /><published>2021-01-06T17:05:18+00:00</published><updated>2021-01-06T17:05:18+00:00</updated><id>https://tlvince.com/prometheus-backfilling</id><content type="html" xml:base="https://tlvince.com/prometheus-backfilling">&lt;p&gt;Backfill support for Prometheus has &lt;a href=&quot;https://github.com/prometheus/prometheus/issues/535&quot;&gt;been long requested&lt;/a&gt; and with the v2.24.0 release, is finally here!&lt;/p&gt;

&lt;h2 id=&quot;openmetrics-primer&quot;&gt;OpenMetrics primer&lt;/h2&gt;

&lt;p&gt;Prometheus’ backfilling currently only supports the &lt;a href=&quot;https://openmetrics.io/&quot;&gt;OpenMetrics&lt;/a&gt; format, which is a simple text (or protobuf) representation for metrics.&lt;/p&gt;

&lt;p&gt;For example:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;# HELP http_requests_total The total number of HTTP requests.
# TYPE http_requests_total counter
http_requests_total{code=&quot;200&quot;,service=&quot;user&quot;} 123 1609954636
http_requests_total{code=&quot;500&quot;,service=&quot;user&quot;} 456 1609954730
# EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;… where &lt;code&gt;HELP&lt;/code&gt; and &lt;code&gt;TYPE&lt;/code&gt; are &lt;a href=&quot;https://github.com/OpenObservability/OpenMetrics/blob/2474ddafe93217cc4979de0b3148e47a1d9340ad/specification/OpenMetrics.md#metricfamily-metadata&quot;&gt;MetricFamily metadata&lt;/a&gt; giving a brief description of the metric family (set) and its &lt;a href=&quot;https://github.com/OpenObservability/OpenMetrics/blob/2474ddafe93217cc4979de0b3148e47a1d9340ad/specification/OpenMetrics.md#metric-types&quot;&gt;data type&lt;/a&gt;. The &lt;code&gt;http_requests_total&lt;/code&gt; metric family contains two metrics; both with comma-separated labels, a value and a timestamp (Unix time).&lt;/p&gt;

&lt;p&gt;Note, the file (“exposition”) &lt;em&gt;must&lt;/em&gt; end with &lt;code&gt;EOF&lt;/code&gt;.&lt;/p&gt;
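
&lt;p&gt;As a rough sketch of the format, an exposition like the one above could be rendered from plain objects in Node.js. The &lt;code&gt;renderExposition&lt;/code&gt; helper and its input shape are hypothetical, and label values are assumed not to need escaping:&lt;/p&gt;

```javascript
// Hypothetical helper: renders one metric family as OpenMetrics text.
// Assumes label values contain no quotes, backslashes or newlines.
const renderExposition = ({ name, help, type, samples }) => {
  const lines = [`# HELP ${name} ${help}`, `# TYPE ${name} ${type}`]
  for (const { labels, value, timestamp } of samples) {
    const labelText = Object.entries(labels)
      .map(([key, val]) => `${key}="${val}"`)
      .join(',')
    lines.push(`${name}{${labelText}} ${value} ${timestamp}`)
  }
  // The exposition must be terminated with EOF
  lines.push('# EOF')
  return lines.join('\n') + '\n'
}

const exposition = renderExposition({
  name: 'http_requests_total',
  help: 'The total number of HTTP requests.',
  type: 'counter',
  samples: [
    { labels: { code: '200', service: 'user' }, value: 123, timestamp: 1609954636 },
    { labels: { code: '500', service: 'user' }, value: 456, timestamp: 1609954730 },
  ],
})
```

&lt;p&gt;Writing &lt;code&gt;exposition&lt;/code&gt; to a file should give something equivalent to the hand-written example.&lt;/p&gt;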

&lt;h2 id=&quot;backfilling&quot;&gt;Backfilling&lt;/h2&gt;

&lt;p&gt;The new backfilling support is &lt;a href=&quot;https://github.com/prometheus/prometheus/blob/v2.24.0/docs/storage.md#backfilling-from-openmetrics-format&quot;&gt;implemented&lt;/a&gt; as the &lt;code&gt;create-blocks-from openmetrics&lt;/code&gt; subcommand to &lt;code&gt;tsdb&lt;/code&gt; via &lt;code&gt;promtool&lt;/code&gt;. Let’s give it a try.&lt;/p&gt;

&lt;p&gt;First ensure you’re running v2.24.0 or later. &lt;a href=&quot;https://github.com/prometheus/prometheus/releases&quot;&gt;Binary releases&lt;/a&gt; are conveniently provided if it has yet to land in your distribution.&lt;/p&gt;

&lt;p&gt;If we launch &lt;code&gt;prometheus&lt;/code&gt; with its default configuration, a &lt;code&gt;data&lt;/code&gt; directory is created with the following contents:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-shell&quot;&gt;❯ tree data
data
├── chunks_head
├── lock
├── queries.active
└── wal
    └── 00000000

2 directories, 3 files
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Let’s run the backfill command:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-shell&quot;&gt;❯ ./promtool tsdb create-blocks-from openmetrics metrics
BLOCK ULID                  MIN TIME       MAX TIME       DURATION     NUM SAMPLES  NUM CHUNKS   NUM SERIES   SIZE
01EVCJ6E3XKHCY35AEYYWQB61N  1609954636000  1609954730001  1m34.001s    2            2            2            805
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The new block is created in the &lt;code&gt;data&lt;/code&gt; directory (by default):&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-shell&quot;&gt;❯ tree data
data
├── 01EVCJ6E3XKHCY35AEYYWQB61N
│   ├── chunks
│   │   └── 000001
│   ├── index
│   ├── meta.json
│   └── tombstones
├── chunks_head
├── lock
├── queries.active
└── wal
    └── 00000000

4 directories, 7 files
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Restart &lt;code&gt;prometheus&lt;/code&gt;, query on the &lt;code&gt;http_requests_total&lt;/code&gt; metric name, switch to the graph view and there we have it: backfilled metrics.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;/assets/img/prometheus-graph-backfilled-metrics.png&quot;&gt;&lt;img src=&quot;/assets/img/th/prometheus-graph-backfilled-metrics.png&quot; alt=&quot;Prometheus graph showing backfilled metrics&quot; /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Note, backfilled data is subject to the server’s &lt;a href=&quot;https://github.com/prometheus/prometheus/blob/v2.24.0/docs/storage.md#operational-aspects&quot;&gt;retention configuration&lt;/a&gt;, both size and time. Set these to values that make sense for your data.&lt;/p&gt;

&lt;h2 id=&quot;usecases&quot;&gt;Use cases&lt;/h2&gt;

&lt;p&gt;Why’s backfilling useful? Some ideas:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Migrating historic data to Prometheus&lt;/li&gt;
  &lt;li&gt;Restoring metrics after system downtime&lt;/li&gt;
  &lt;li&gt;Generating fake metrics to be used as seed data, for example:&lt;/li&gt;
&lt;/ol&gt;

&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;#!/usr/bin/env bash
set -euo pipefail

# An hour ago (GNU date); avoids octal errors on &quot;08&quot;/&quot;09&quot; and underflow at midnight
dateHour=&quot;$(date -d '1 hour ago' +%Y-%m-%dT%H)&quot;

cat &amp;lt;&amp;lt; EOF
# HELP http_requests_total The total number of HTTP requests.
# TYPE http_requests_total counter
EOF

for i in {0..59}; do
  for status in 200 500; do
    echo &quot;http_requests_total{code=\&quot;$status\&quot;,service=\&quot;user\&quot;} $RANDOM $(date -d &quot;${dateHour}:$(printf %02g &quot;$i&quot;):00&quot; +%s)&quot;
  done
done

echo &quot;# EOF&quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;a href=&quot;/assets/img/prometheus-graph-backfilled-metrics-seed.png&quot;&gt;&lt;img src=&quot;/assets/img/th/prometheus-graph-backfilled-metrics-seed.png&quot; alt=&quot;Prometheus graph from seed data&quot; /&gt;&lt;/a&gt;&lt;/p&gt;</content><author><name>Tom Vincent</name></author><summary type="html">Backfill support for Prometheus has been long requested and with the v2.24.0 release, is finally here!</summary></entry><entry><title type="html">Decorated Lambda handlers</title><link href="https://tlvince.com/decorated-lambda-handlers" rel="alternate" type="text/html" title="Decorated Lambda handlers" /><published>2020-03-31T00:00:00+00:00</published><updated>2020-03-31T00:00:00+00:00</updated><id>https://tlvince.com/decorated-lambda-handlers</id><content type="html" xml:base="https://tlvince.com/decorated-lambda-handlers">&lt;p&gt;The main sell of AWS Lambda (and Functions as a Service in general) is the ability to shift developer attention away from infrastructure to the business logic. Nonetheless, there are a number of cross-cutting concerns that Lambdas need to &lt;em&gt;handle&lt;/em&gt;. This post outlines some of these and how they can be addressed.&lt;/p&gt;

&lt;p&gt;Note, this focuses on the Node.js runtime, but the same principles can be applied to others.&lt;/p&gt;

&lt;h2 id=&quot;structured-logging&quot;&gt;Structured logging&lt;/h2&gt;

&lt;p&gt;As every Lambda function is automatically set up with an &lt;a href=&quot;https://docs.aws.amazon.com/lambda/latest/dg/monitoring-functions-logs.html&quot;&gt;AWS CloudWatch Log group&lt;/a&gt;, debugging can be as simple as adding a &lt;code&gt;console.log&lt;/code&gt;. This can often be enough for simpler cases, but as projects grow, so does the need for logs. Perhaps your system is composed of multiple Lambdas and you need to search across them, or you need to run aggregations. Whilst this can be solved with regexes, writing logs in a machine-readable format such as JSON simplifies parsing and querying.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://getpino.io&quot;&gt;Pino&lt;/a&gt; is a lightweight structured logging library that works well with Lambda. Using its &lt;a href=&quot;https://getpino.io/#/docs/api?id=base-object&quot;&gt;base option&lt;/a&gt;, we can decorate all log lines with the Lambda’s &lt;a href=&quot;https://docs.aws.amazon.com/lambda/latest/dg/configuration-envvars.html#configuration-envvars-runtime&quot;&gt;runtime context&lt;/a&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-js&quot;&gt;const pino = require('pino')

const logger = pino({
  base: {
    memorySize: process.env.AWS_LAMBDA_FUNCTION_MEMORY_SIZE,
    region: process.env.AWS_REGION,
    runtime: process.env.AWS_EXECUTION_ENV,
    version: process.env.AWS_LAMBDA_FUNCTION_VERSION,
  },
  name: process.env.AWS_LAMBDA_FUNCTION_NAME,
  level: process.env.LOG_LEVEL || 'info',
  useLevelLabels: true,
})

exports.handler = () =&amp;gt; {
  logger.info({ uuid: 'foo' }, 'hello world')
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Results in logs such as:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-json&quot;&gt;{
  &quot;level&quot;: &quot;info&quot;,
  &quot;memorySize&quot;: &quot;128&quot;,
  &quot;msg&quot;: &quot;hello world&quot;,
  &quot;name&quot;: &quot;my-lambda&quot;,
  &quot;region&quot;: &quot;eu-west-2&quot;,
  &quot;runtime&quot;: &quot;AWS_Lambda_nodejs12.x&quot;,
  &quot;time&quot;: 1493426328206,
  &quot;uuid&quot;: &quot;foo&quot;,
  &quot;v&quot;: 1,
  &quot;version&quot;: &quot;$LATEST&quot;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;CloudWatch Logs has first-party support for &lt;a href=&quot;https://docs.aws.amazon.com/AmazonCloudWatch/latest/logs/FilterAndPatternSyntax.html#matching-terms-events-json&quot;&gt;JSON filters&lt;/a&gt;. For example, to filter log lines containing the &lt;code&gt;foo&lt;/code&gt; UUID, use &lt;code&gt;{ $.uuid = &quot;foo&quot; }&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;/assets/img/cloudwatch-log-json-filtering.png&quot;&gt;&lt;img src=&quot;/assets/img/th/cloudwatch-log-json-filtering.png&quot; alt=&quot;CloudWatch Log JSON filtering&quot; /&gt;&lt;/a&gt;&lt;/p&gt;
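
&lt;p&gt;The same idea can be tried locally against captured log lines. The sketch below is only a stand-in for an equality filter such as &lt;code&gt;{ $.uuid = &quot;foo&quot; }&lt;/code&gt;; the &lt;code&gt;matchesField&lt;/code&gt; helper and the sample lines are hypothetical and this is not how CloudWatch evaluates filters:&lt;/p&gt;

```javascript
// Hypothetical local stand-in for an equality filter on a top-level field.
// It only illustrates why JSON logs are easy to query.
const matchesField = (line, field, expected) => {
  try {
    return JSON.parse(line)[field] === expected
  } catch (err) {
    // Lines that are not JSON objects never match
    return false
  }
}

// Illustrative log lines, including one that is not JSON
const logLines = [
  '{"level":"info","msg":"hello world","uuid":"foo"}',
  '{"level":"info","msg":"hello again","uuid":"bar"}',
  'START RequestId: not-json',
]

const matched = logLines.filter(line => matchesField(line, 'uuid', 'foo'))
```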

&lt;h2 id=&quot;instrumentation&quot;&gt;Instrumentation&lt;/h2&gt;

&lt;p&gt;As a distributed system grows, debugging becomes harder. Microservice and serverless architectures are composed of many services interacting with each other. When there’s a problem, it can be difficult to identify which service in the mesh is at fault.&lt;/p&gt;

&lt;p&gt;Yan Cui’s &lt;a href=&quot;https://theburningmonk.com/2017/09/capture-and-forward-correlation-ids-through-different-lambda-event-sources/&quot;&gt;Capture and forward correlation IDs through different Lambda event sources&lt;/a&gt; outlines how correlation IDs can be used to alleviate this. In the same way as identifiers such as a &lt;code&gt;uuid&lt;/code&gt; can be logged to provide context, other identifiers can be used to thread messages together as they flow through the system.&lt;/p&gt;
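
&lt;p&gt;A minimal sketch of the capture-and-forward idea, assuming an API Gateway style event with a &lt;code&gt;headers&lt;/code&gt; object. The &lt;code&gt;x-correlation-id&lt;/code&gt; header name and both helpers are hypothetical choices for illustration, not part of any AWS API:&lt;/p&gt;

```javascript
// Hypothetical header name; use whatever your services agree on
const CORRELATION_HEADER = 'x-correlation-id'

// Reuse the caller's correlation ID if present, else start a new chain
// from this invocation's awsRequestId
const getCorrelationId = (event, context) => {
  const headers = event.headers || {}
  const key = Object.keys(headers).find(
    name => name.toLowerCase() === CORRELATION_HEADER
  )
  return key ? headers[key] : context.awsRequestId
}

// Headers to attach to outgoing requests so the ID travels downstream
const forwardHeaders = (event, context) => ({
  [CORRELATION_HEADER]: getCorrelationId(event, context),
})
```

&lt;p&gt;The returned object can be merged into the headers of any outgoing HTTP request, and the same ID added to each log line.&lt;/p&gt;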

&lt;p&gt;AWS Lambda includes &lt;code&gt;awsRequestId&lt;/code&gt; in its &lt;a href=&quot;https://docs.aws.amazon.com/lambda/latest/dg/nodejs-prog-model-context.html&quot;&gt;context object&lt;/a&gt;, which is unique per invocation. When set up as an integration in API Gateway, this provides a way to trace a request back to its initial API call. However, this ID is not automatically forwarded to further downstream services e.g. other AWS services or third-party APIs.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://aws.amazon.com/xray/&quot;&gt;AWS X-Ray&lt;/a&gt; is a fully-featured tracing system that provides this functionality out of the box. In automatic mode (the default), all outgoing HTTP(S) requests can be instrumented using the &lt;a href=&quot;https://github.com/aws/aws-xray-sdk-node/blob/e1abf865217ddc87b54819a20f5df75937a2978b/packages/core/README.md&quot;&gt;captureHTTPsGlobal&lt;/a&gt; method:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-js&quot;&gt;const https = require('https')
const got = require('got')
const AWSXRay = require('aws-xray-sdk-core')

exports.handler = async () =&amp;gt; {
  AWSXRay.captureHTTPsGlobal(https)
  await got('https://tlvince.com')
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Note, this works by &lt;a href=&quot;https://en.wikipedia.org/wiki/Monkey_patch&quot;&gt;monkey patching&lt;/a&gt; the core Node.js &lt;code&gt;http&lt;/code&gt;/&lt;code&gt;https&lt;/code&gt; modules, which can be dangerous. Alternatively, X-Ray’s scope can be reduced to AWS calls using the &lt;code&gt;captureAWS&lt;/code&gt; method.&lt;/p&gt;

&lt;p&gt;For completeness, we can also add the X-Ray trace ID as well as the &lt;code&gt;awsRequestId&lt;/code&gt; to the logs for easier cross-referencing. One gotcha to remember is that &lt;a href=&quot;https://docs.aws.amazon.com/lambda/latest/dg/downstream-tracing.html&quot;&gt;neither ID is set&lt;/a&gt; until the function has been invoked, so both will be &lt;code&gt;undefined&lt;/code&gt; if referenced in the function’s global context rather than inside its handler. To work around this, use a Pino &lt;a href=&quot;https://getpino.io/#/docs/api?id=child&quot;&gt;child logger&lt;/a&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-js&quot;&gt;const https = require('https')
const got = require('got')
const pino = require('pino')
const AWSXRay = require('aws-xray-sdk-core')

const parentLogger = pino({
  base: {
    memorySize: process.env.AWS_LAMBDA_FUNCTION_MEMORY_SIZE,
    region: process.env.AWS_REGION,
    runtime: process.env.AWS_EXECUTION_ENV,
    version: process.env.AWS_LAMBDA_FUNCTION_VERSION,
  },
  name: process.env.AWS_LAMBDA_FUNCTION_NAME,
  level: process.env.LOG_LEVEL || 'info',
  useLevelLabels: true,
})

exports.handler = (event, context) =&amp;gt; {
  AWSXRay.captureHTTPsGlobal(https)

  const logger = parentLogger.child({
    traceId: process.env._X_AMZN_TRACE_ID,
    awsRequestId: context.awsRequestId,
  })

  logger.info({ uuid: 'foo' }, 'hello world')
}
&lt;/code&gt;&lt;/pre&gt;
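
&lt;p&gt;The &lt;code&gt;_X_AMZN_TRACE_ID&lt;/code&gt; value is a semicolon-separated list of &lt;code&gt;key=value&lt;/code&gt; pairs. If only the root trace ID is wanted in the logs, it could be split out along these lines. The &lt;code&gt;parseTraceHeader&lt;/code&gt; helper is hypothetical and the sample value is illustrative:&lt;/p&gt;

```javascript
// Hypothetical helper: splits a tracing header into its key/value parts.
// Parts that are not exactly key=value are ignored.
const parseTraceHeader = (header = '') =>
  Object.fromEntries(
    header
      .split(';')
      .map(part => part.split('='))
      .filter(pair => pair.length === 2)
  )

// Illustrative value in the Root/Parent/Sampled shape
const trace = parseTraceHeader(
  'Root=1-5759e988-bd862e3fe1be46a994272793;Parent=53995c3f42cd8ad8;Sampled=1'
)
```

&lt;p&gt;&lt;code&gt;trace.Root&lt;/code&gt; can then be passed to the child logger in place of the full header.&lt;/p&gt;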

&lt;h2 id=&quot;event-validation&quot;&gt;Event validation&lt;/h2&gt;

&lt;p&gt;Probably the most important technical concern for any externally-facing service is to validate its inputs. Doing this upfront helps guard against malformed (or malicious) events, helps simplify property references within the business logic and can also help reduce costs by short-circuiting the function early.&lt;/p&gt;

&lt;p&gt;Depending on your needs, a JSON schema validator such as &lt;a href=&quot;https://github.com/epoberezkin/ajv&quot;&gt;ajv&lt;/a&gt; is typically the go-to option. &lt;a href=&quot;https://github.com/eivindfjeldstad/validate&quot;&gt;validate&lt;/a&gt; is a lightweight alternative, which offers a more expressive API at the expense of schema interoperability.&lt;/p&gt;

&lt;p&gt;An example for SQS events:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-js&quot;&gt;const Schema = require('validate')

const schema = new Schema({
  Records: [
    {
      body: {
        type: String,
        required: true,
      },
    },
  ],
})

exports.handler = event =&amp;gt; {
  const errors = schema.validate(event, { strip: false })
  if (errors.length) {
    throw new Error(errors)
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Note, &lt;code&gt;{ strip: false }&lt;/code&gt; is used to ensure &lt;code&gt;validate&lt;/code&gt; does not mutate the event object.&lt;/p&gt;

&lt;h2 id=&quot;environment-variable-validation&quot;&gt;Environment variable validation&lt;/h2&gt;

&lt;p&gt;In the same manner as input event validation, environment variables can be validated via a simple &lt;code&gt;process.env&lt;/code&gt; check:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-js&quot;&gt;const requiredEnvs = ['FOO']
const missingEnvs = requiredEnvs.filter(requiredEnv =&amp;gt; !process.env[requiredEnv])
if (missingEnvs.length) {
  throw new Error(`missing environment variables ${missingEnvs}`)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;reusing-http-connections&quot;&gt;Reusing HTTP connections&lt;/h2&gt;

&lt;p&gt;A neat performance optimisation I learnt from &lt;a href=&quot;https://vimeo.com/287511222&quot;&gt;Matt Lavin’s Node Summit 2018 talk&lt;/a&gt; was that HTTP connections can be reused. By default, Node.js’s HTTP agent does not use &lt;a href=&quot;https://en.wikipedia.org/wiki/HTTP_persistent_connection&quot;&gt;keep-alive&lt;/a&gt; and therefore every request incurs the overheads of establishing a new TCP connection.&lt;/p&gt;

&lt;p&gt;Since the majority of HTTP requests made by Lambdas are to other AWS services, it makes sense to scope this optimisation first and observe its effect:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-js&quot;&gt;const AWS = require('aws-sdk')
const https = require('https')

const agent = new https.Agent({
  keepAlive: true,
})

AWS.config.update({
  httpOptions: {
    agent,
  },
})
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Since &lt;a href=&quot;https://github.com/aws/aws-sdk-js/blob/master/CHANGELOG.md#24630&quot;&gt;aws-sdk 2.463.0&lt;/a&gt;, this is further simplified by setting the &lt;code&gt;AWS_NODEJS_CONNECTION_REUSE_ENABLED&lt;/code&gt; environment variable. The configuration can therefore be removed from the handler and moved to your infrastructure as code tool of choice.&lt;/p&gt;

&lt;h2 id=&quot;decorator-example&quot;&gt;Decorator example&lt;/h2&gt;

&lt;p&gt;Each of these concerns can be combined together into a re-usable decorator function. For example:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-js&quot;&gt;const https = require('https')

const pino = require('pino')
const Schema = require('validate')
const AWSXRay = require('aws-xray-sdk-core')

const parentLogger = pino({
  base: {
    memorySize: process.env.AWS_LAMBDA_FUNCTION_MEMORY_SIZE,
    region: process.env.AWS_REGION,
    runtime: process.env.AWS_EXECUTION_ENV,
    version: process.env.AWS_LAMBDA_FUNCTION_VERSION,
  },
  name: process.env.AWS_LAMBDA_FUNCTION_NAME,
  level: process.env.LOG_LEVEL || 'info',
  useLevelLabels: true,
})

module.exports = ({ handler, requiredEnvs = [], eventSchema = {} }) =&amp;gt; (
  event,
  context
) =&amp;gt; {
  AWSXRay.captureHTTPsGlobal(https)

  const logger = parentLogger.child({
    traceId: process.env._X_AMZN_TRACE_ID,
    awsRequestId: context.awsRequestId,
  })

  const schema = new Schema(eventSchema)
  const errors = schema.validate(event, { strip: false })
  if (errors.length) {
    logger.debug({ errors }, 'event validation errors')
    throw new Error(errors)
  }

  const missingEnvs = requiredEnvs.filter(
    requiredEnv =&amp;gt; !process.env[requiredEnv]
  )

  if (missingEnvs.length) {
    logger.debug({ missingEnvs }, 'missing environment variables')
    throw new Error(`missing environment variables ${missingEnvs}`)
  }

  return handler(event, context, { logger })
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The Lambda handler body itself can then be simplified to focus on the business logic, besides a few lines of configuration:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-js&quot;&gt;const decoratedHandler = require('./handler-decorator')

const handler = async (event, context, { logger }) =&amp;gt; {
  logger.debug('reached Lambda handler')
  return event.Records.map(record =&amp;gt; record.body)
}

exports.handler = decoratedHandler({
  handler,
  requiredEnvs: ['FOO'],
  eventSchema: {
    Records: [
      {
        body: {
          type: String,
          required: true,
        },
      },
    ],
  },
})
&lt;/code&gt;&lt;/pre&gt;
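
&lt;p&gt;To exercise the pattern locally without any dependencies, a cut-down decorator can be sketched as below. This is an assumption-laden stand-in rather than the decorator above: &lt;code&gt;decorate&lt;/code&gt; is a hypothetical name, &lt;code&gt;console&lt;/code&gt; replaces Pino, and event validation and X-Ray are left out:&lt;/p&gt;

```javascript
// Dependency-free sketch of the decorator pattern (hypothetical name).
// Only the environment check and logger injection are reproduced here.
const decorate = ({ handler, requiredEnvs = [] }) => (event, context) => {
  const missingEnvs = requiredEnvs.filter(name => !process.env[name])
  if (missingEnvs.length) {
    throw new Error(`missing environment variables ${missingEnvs}`)
  }

  // console stands in for a Pino child logger
  const logger = {
    debug: (...args) => console.debug(context.awsRequestId, ...args),
  }

  return handler(event, context, { logger })
}

// Invoke the wrapped handler with a stubbed SQS-like event and context
process.env.FOO = 'bar'
const wrapped = decorate({
  handler: (event, context, { logger }) => {
    logger.debug('reached Lambda handler')
    return event.Records.map(record => record.body)
  },
  requiredEnvs: ['FOO'],
})
const bodies = wrapped({ Records: [{ body: 'hello' }] }, { awsRequestId: 'req-1' })
```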

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;By extracting noisy yet necessary boilerplate, Lambda handlers can be kept lean and focussed on their business logic. A number of cross-cutting concerns were discussed, with an approach to encapsulate them using a reusable function following the decorator pattern. Alternatives include &lt;a href=&quot;https://middy.js.org/&quot;&gt;middy&lt;/a&gt;, a more pluggable, middleware-based approach or &lt;a href=&quot;https://lambda-decorators.readthedocs.io/en/latest/&quot;&gt;lambda_decorators&lt;/a&gt; for the Python runtime.&lt;/p&gt;</content><author><name>Tom Vincent</name></author><summary type="html">The main sell of AWS Lambda (and Functions as a Service in general) is the ability to shift developer attention away from infrastructure to the business logic. Nonetheless, there are a number of cross-cutting concerns that Lambdas need to handle. This post outlines some of these and how they can be addressed.</summary></entry><entry><title type="html">Terraforming Lambdas</title><link href="https://tlvince.com/terraforming-lambdas" rel="alternate" type="text/html" title="Terraforming Lambdas" /><published>2020-02-07T00:00:00+00:00</published><updated>2020-02-07T00:00:00+00:00</updated><id>https://tlvince.com/terraforming-lambdas</id><content type="html" xml:base="https://tlvince.com/terraforming-lambdas">&lt;p&gt;When provisioning a Lambda function with Terraform, one gotcha to remember is that Terraform expects the deployment package to exist before it can create the function itself. Put another way, the &lt;em&gt;infrastructure&lt;/em&gt; code depends on the &lt;em&gt;application&lt;/em&gt; code.&lt;/p&gt;

&lt;p&gt;One way of handling this is to manage both the function logic and its provisioning in Terraform using a &lt;a href=&quot;https://www.terraform.io/docs/providers/aws/r/lambda_function.html#specifying-the-deployment-package&quot;&gt;local file deployment package&lt;/a&gt;:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;/assets/img/terraform-managed-app.png&quot;&gt;&lt;img src=&quot;/assets/img/th/terraform-managed-app.png&quot; alt=&quot;Terraform managing application code&quot; /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This ensures Terraform can build out its dependency graph correctly and so can create the deployment package before the function.&lt;/p&gt;

&lt;p&gt;However, there are a number of downsides to this approach. Firstly, as the docs mention, Terraform is unoptimised for handling large file uploads. It does not handle multi-part or resuming.&lt;/p&gt;

&lt;p&gt;Secondly, because &lt;a href=&quot;https://www.terraform.io/docs/providers/aws/r/lambda_function.html#source_code_hash&quot;&gt;source_code_hash&lt;/a&gt; is a computed property (its value isn’t known until &lt;code&gt;terraform apply&lt;/code&gt; is run), Terraform is often overly cautious in deciding when the deployment package has changed. More often than not, this results in Terraform creating a new version (and therefore reuploading the deployment package) on &lt;em&gt;every&lt;/em&gt; run.&lt;/p&gt;
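
&lt;p&gt;For reference, the hash in question is a base64-encoded SHA-256 digest of the package. A sketch for checking locally whether two builds would hash differently, where &lt;code&gt;sourceCodeHash&lt;/code&gt; is a hypothetical helper name:&lt;/p&gt;

```javascript
const crypto = require('crypto')

// Hypothetical helper: base64-encoded SHA-256 digest of a package's bytes
const sourceCodeHash = contents =>
  crypto.createHash('sha256').update(contents).digest('base64')

// Hash of an empty package, as a fixed reference point
const emptyHash = sourceCodeHash(Buffer.alloc(0))

// Any change in the bytes yields a different hash
const changed =
  sourceCodeHash(Buffer.from('exports.handler = () => {}')) !== emptyHash
```

&lt;p&gt;In practice the input would be the zip file’s bytes, for example from &lt;code&gt;fs.readFileSync&lt;/code&gt;. Zip archives embed file timestamps, so two builds of identical source may still hash differently.&lt;/p&gt;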

&lt;h2 id=&quot;decoupling-application-code-from-terraform&quot;&gt;Decoupling application code from Terraform&lt;/h2&gt;

&lt;p&gt;Another approach is to decouple infrastructure from application code. In this approach, Terraform creates a placeholder deployment package to fulfil its dependency requirement and the deployment of the &lt;em&gt;real&lt;/em&gt; application code is managed outside of Terraform, ideally in its own automation step:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;/assets/img/decoupled-infra-app.png&quot;&gt;&lt;img src=&quot;/assets/img/th/decoupled-infra-app.png&quot; alt=&quot;Decoupled infrastructure and app code&quot; /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;An implementation of this (in Terraform 0.12.x) uses the &lt;code&gt;archive_file&lt;/code&gt; data source along with the &lt;code&gt;s3_key&lt;/code&gt; and &lt;code&gt;s3_bucket&lt;/code&gt; attributes in the Lambda resource:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-hcl&quot;&gt;data &quot;archive_file&quot; &quot;my_lambda_placeholder_zip&quot; {
  type        = &quot;zip&quot;
  output_path = &quot;${path.module}/lambda/my_lambda.zip&quot;

  source {
    content  = &quot;exports.handler = () =&amp;gt; {}&quot;
    filename = &quot;index.js&quot;
  }
}

resource &quot;aws_s3_bucket_object&quot; &quot;my_lambda&quot; {
  bucket = aws_s3_bucket.deployment.id
  key    = &quot;lambda/connection-manager.zip&quot;
  source = data.archive_file.my_lambda_placeholder_zip.output_path
}

resource &quot;aws_lambda_function&quot; &quot;my_lambda&quot; {
  function_name = &quot;my-lambda&quot;
  description   = &quot;Decoupled Lambda deployment example&quot;
  s3_bucket     = aws_s3_bucket.deployment.id
  s3_key        = aws_s3_bucket_object.my_lambda.id
  handler       = &quot;index.handler&quot;
  runtime       = &quot;nodejs12.x&quot;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The application deployment step is then a few lines of shell:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-shell&quot;&gt;#!/bin/sh

cd /path/to/my-lambda
npm run build
cd dist
zip -9rX &quot;my-lambda.zip&quot; .
aws lambda update-function-code \
  --function-name &quot;my-lambda&quot; \
  --zip-file &quot;fileb://my-lambda.zip&quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;By decoupling infrastructure from application provisioning in Terraform, we trade managing part of the stack outside of Terraform for the ability to optimise the deployment of application code. Issues surrounding change detection on often large deployment artefacts are resolved and uploads are more efficiently handled by the AWS CLI.&lt;/p&gt;

&lt;p&gt;Typically, a function’s configuration and dependent infrastructure change less often than the application logic itself. By decoupling the two, the risk of failure between infrastructure changesets is reduced.&lt;/p&gt;</content><author><name>Tom Vincent</name></author><summary type="html">When provisioning a Lambda function with Terraform, one gotcha to remember is that Terraform expects the deployment package to exist before it can create the function itself. Put another way, the infrastructure code depends on the application code.</summary></entry><entry><title type="html">Lambdaless</title><link href="https://tlvince.com/lambdaless" rel="alternate" type="text/html" title="Lambdaless" /><published>2019-01-01T00:00:00+00:00</published><updated>2019-01-01T00:00:00+00:00</updated><id>https://tlvince.com/lambdaless</id><content type="html" xml:base="https://tlvince.com/lambdaless">&lt;p&gt;Let’s assume you need to expose a JSON file behind an API. Using a serverless approach with AWS, you might first reach for an architecture like the following:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;/assets/img/api-gateway-to-lambda-to-s3.png&quot;&gt;&lt;img src=&quot;/assets/img/th/api-gateway-to-lambda-to-s3.png&quot; alt=&quot;API Gateway to Lambda to S3&quot; title=&quot;API Gateway to Lambda to S3&quot; /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;… i.e. an API Gateway in front of a Lambda, which calls S3. Alternatively, did you know you could remove the Lambda and have API Gateway call S3 directly?&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;/assets/img/api-gateway-to-s3.png&quot;&gt;&lt;img src=&quot;/assets/img/th/api-gateway-to-s3.png&quot; alt=&quot;API Gateway to S3&quot; title=&quot;API Gateway to S3&quot; /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is what I call “Lambdaless”. It leverages API Gateway’s &lt;code&gt;AWS&lt;/code&gt; &lt;a href=&quot;https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-api-integration-types.html&quot;&gt;integration type&lt;/a&gt;, which allows you to expose any AWS service without any intermediate application logic. &lt;a href=&quot;https://docs.aws.amazon.com/apigateway/latest/developerguide/models-mappings.html&quot;&gt;Mapping templates&lt;/a&gt; provide the glue to transform request/responses, using the &lt;a href=&quot;https://velocity.apache.org/engine/devel/vtl-reference.html&quot;&gt;Velocity&lt;/a&gt; templating language (VTL) and &lt;a href=&quot;https://goessner.net/articles/JsonPath/&quot;&gt;JSONPath&lt;/a&gt; expressions.&lt;/p&gt;

&lt;h2 id=&quot;walkthrough&quot;&gt;Walkthrough&lt;/h2&gt;

&lt;p&gt;Continuing with the S3 example above, create an API Gateway with a GET method and set up the integration request per the following:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;choose the AWS service type, region and Simple Storage Service (S3)&lt;/li&gt;
  &lt;li&gt;select the GET HTTP method&lt;/li&gt;
  &lt;li&gt;select the “use path override” action type&lt;/li&gt;
  &lt;li&gt;enter the object’s &lt;code&gt;&amp;lt;bucket&amp;gt;/&amp;lt;prefix&amp;gt;&lt;/code&gt; in the path override field&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href=&quot;/assets/img/api-gateway-s3-integration-request.png&quot;&gt;&lt;img src=&quot;/assets/img/th/api-gateway-s3-integration-request.png&quot; alt=&quot;API Gateway S3 Integration Request&quot; title=&quot;API Gateway S3 Integration Request&quot; /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Create an IAM role with a policy granting &lt;code&gt;s3:GetObject&lt;/code&gt; permission on your &lt;code&gt;&amp;lt;bucket&amp;gt;/&amp;lt;prefix&amp;gt;&lt;/code&gt; and a Trust Relationship that allows API Gateway to assume it. Now all you need to do is switch to the test view, click “test” and you should see the contents of your JSON object in the response body:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;/assets/img/api-gateway-s3-response.png&quot;&gt;&lt;img src=&quot;/assets/img/th/api-gateway-s3-response.png&quot; alt=&quot;API Gateway S3 Response&quot; title=&quot;API Gateway S3 Response&quot; /&gt;&lt;/a&gt;&lt;/p&gt;
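
&lt;p&gt;The IAM role described above boils down to two policy documents. A sketch of their likely shape, built as plain objects: the &lt;code&gt;buildPolicies&lt;/code&gt; helper, bucket name and prefix are hypothetical, and the trailing wildcard on the resource is an assumption about how the prefix is matched:&lt;/p&gt;

```javascript
// Hypothetical helper building the role's trust and permission policies
const buildPolicies = (bucket, prefix) => ({
  // Trust Relationship: lets the API Gateway service assume the role
  trustPolicy: {
    Version: '2012-10-17',
    Statement: [
      {
        Effect: 'Allow',
        Principal: { Service: 'apigateway.amazonaws.com' },
        Action: 'sts:AssumeRole',
      },
    ],
  },
  // Permission policy: read-only access to objects under the prefix
  permissionPolicy: {
    Version: '2012-10-17',
    Statement: [
      {
        Effect: 'Allow',
        Action: 's3:GetObject',
        Resource: `arn:aws:s3:::${bucket}/${prefix}*`,
      },
    ],
  },
})

// Placeholder bucket and prefix
const policies = buildPolicies('my-bucket', 'data/')
```

&lt;p&gt;Either object can be serialised with &lt;code&gt;JSON.stringify&lt;/code&gt; and pasted into the IAM console or your infrastructure as code tool.&lt;/p&gt;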

&lt;h2 id=&quot;examples&quot;&gt;Examples&lt;/h2&gt;

&lt;h3 id=&quot;mock-integration&quot;&gt;Mock integration&lt;/h3&gt;

&lt;p&gt;Taking the JSON example to its logical conclusion, we can go a step further and remove S3 from the equation altogether. Choose the &lt;code&gt;MOCK&lt;/code&gt; integration type, add the required &lt;code&gt;{&quot;statusCode&quot;: 200}&lt;/code&gt; request mapping template and move the contents of your JSON object to the integration response mapping template.&lt;/p&gt;

&lt;p&gt;This approach typically yields ~3ms response times (compared to ~65ms with the additional hop to S3) and is a good solution for static data.&lt;/p&gt;

&lt;h3 id=&quot;dynamodb&quot;&gt;DynamoDB&lt;/h3&gt;

&lt;p&gt;Simple CRUD APIs with DynamoDB are a great fit for Lambdaless. API Gateway’s &lt;a href=&quot;https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-mapping-template-reference.html#context-variables-template-example&quot;&gt;$context variables&lt;/a&gt; include &lt;code&gt;$context.requestId&lt;/code&gt;, which can be used as an entity’s UUID, along with &lt;code&gt;$context.requestTimeEpoch&lt;/code&gt; for created/updated at timestamps.&lt;/p&gt;

&lt;p&gt;Request/response templates can be used to convert to/from DynamoDB’s &lt;a href=&quot;https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/Programming.LowLevelAPI.html#Programming.LowLevelAPI.DataTypeDescriptors&quot;&gt;data type descriptors&lt;/a&gt;, for example:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-vtl&quot;&gt;#set($inputRoot = $input.path('$'))
{
  &quot;TableName&quot;: &quot;my-table&quot;,
  &quot;Key&quot;: {
    &quot;uuid&quot;: {
      &quot;S&quot;: &quot;$context.requestId&quot;
    }
  },
  &quot;Item&quot;: {
    &quot;uuid&quot;: {
      &quot;S&quot;: &quot;$context.requestId&quot;
    },
    &quot;name&quot;: {
      &quot;S&quot;: &quot;$inputRoot.name&quot;
    },
    &quot;items&quot;: {
      &quot;L&quot;: [
        #foreach($item in $inputRoot.items)
        {
          &quot;S&quot;: &quot;$item&quot;
        }#if($foreach.hasNext),#end
        #end
      ]
    },
    &quot;createdAt&quot;: {
      &quot;N&quot;: &quot;$context.requestTimeEpoch&quot;
    }
  }
}
&lt;/code&gt;&lt;/pre&gt;
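
&lt;p&gt;For comparison, the conversion the template performs can be sketched in JavaScript. The &lt;code&gt;toAttribute&lt;/code&gt; and &lt;code&gt;fromAttribute&lt;/code&gt; helpers are hypothetical and cover only the string, number and list descriptors used above:&lt;/p&gt;

```javascript
// Hypothetical helpers covering only the S, N and L descriptors
const toAttribute = value => {
  if (typeof value === 'string') return { S: value }
  // Numbers are sent to DynamoDB as strings
  if (typeof value === 'number') return { N: String(value) }
  if (Array.isArray(value)) return { L: value.map(toAttribute) }
  throw new TypeError(`unsupported type: ${typeof value}`)
}

const fromAttribute = attribute => {
  if ('S' in attribute) return attribute.S
  if ('N' in attribute) return Number(attribute.N)
  if ('L' in attribute) return attribute.L.map(fromAttribute)
  throw new TypeError('unsupported descriptor')
}

// Placeholder values mirroring the fields in the template
const item = {
  name: toAttribute('widget'),
  items: toAttribute(['a', 'b']),
  createdAt: toAttribute(1546300800000),
}
```

&lt;p&gt;The mapping template does the same work declaratively, which is what makes the intermediate Lambda unnecessary here.&lt;/p&gt;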

&lt;h3 id=&quot;other-ideas&quot;&gt;Other ideas&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;use the &lt;code&gt;HTTP_PROXY&lt;/code&gt; integration to bypass region-locked websites&lt;/li&gt;
  &lt;li&gt;pump events into an SQS queue&lt;/li&gt;
  &lt;li&gt;raise AWS Support tickets using your existing customer service solution&lt;/li&gt;
&lt;/ul&gt;

&lt;h2 id=&quot;advantages&quot;&gt;Advantages&lt;/h2&gt;

&lt;p&gt;A simple Lambda may seem innocuous at first, but each function comes with its own maintenance costs, including:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;maintaining the application code&lt;/li&gt;
  &lt;li&gt;maintaining dependencies&lt;/li&gt;
  &lt;li&gt;any CI/CD tooling around delivering that code&lt;/li&gt;
  &lt;li&gt;performing runtime upgrades&lt;/li&gt;
  &lt;li&gt;security scanning&lt;/li&gt;
  &lt;li&gt;configuring monitoring and alerts (e.g. CloudWatch)&lt;/li&gt;
  &lt;li&gt;configuring instrumentation (e.g. X-Ray)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Removing a Lambda means fewer resources to maintain, test and pay for.
Latency is also reduced: there are fewer hops in the chain and the issue of cold starts disappears.&lt;/p&gt;

&lt;h2 id=&quot;disadvantages&quot;&gt;Disadvantages&lt;/h2&gt;

&lt;p&gt;There are, however, a number of drawbacks to consider with this Lambdaless method. Probably the most apparent is that you can only integrate with a single service at a time. This limits the approach to simple integrations and rules out complex logic, e.g. joins.&lt;/p&gt;

&lt;p&gt;Velocity, whilst offering some level of &lt;a href=&quot;https://velocity.apache.org/engine/devel/vtl-reference.html#directives&quot;&gt;control flow&lt;/a&gt; such as &lt;code&gt;if/else&lt;/code&gt; and loops, as well as AWS’s own extensions such as &lt;a href=&quot;https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-mapping-template-reference.html#util-template-reference&quot;&gt;util functions&lt;/a&gt;, is somewhat of a niche language and introduces its own complexity over using your Lambda runtime language of choice (e.g. JavaScript, Python).&lt;/p&gt;

&lt;p&gt;This approach is also tightly coupled with API Gateway. The &lt;code&gt;AWS&lt;/code&gt; integration type and request/response mapping template approach is unique to API Gateway and therefore is less portable than Lambda application logic (which is easier to abstract from the Lambda environment itself).&lt;/p&gt;

&lt;p&gt;It also relies on “low-level” AWS APIs, which are less accessible and often sparsely documented compared to their corresponding SDK wrappers.&lt;/p&gt;

&lt;h2 id=&quot;further-reading&quot;&gt;Further reading&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://aws.amazon.com/blogs/compute/using-amazon-api-gateway-as-a-proxy-for-dynamodb/&quot;&gt;Using Amazon API Gateway as a proxy for DynamoDB&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://medium.com/hackernoon/serverless-and-lambdaless-scalable-crud-data-api-with-aws-api-gateway-and-dynamodb-626161008bb2&quot;&gt;Serverless and Lambdaless Scalable CRUD Data API with AWS API Gateway and DynamoDB&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Thanks&lt;/strong&gt; to &lt;a href=&quot;https://github.com/kahlos&quot;&gt;Callum Vincent&lt;/a&gt; for reading drafts of this.&lt;/p&gt;</content><author><name>Tom Vincent</name></author><summary type="html">Let’s assume you need to expose a JSON file behind an API. Using a serverless approach with AWS, you might first reach for an architecture like the following:</summary></entry><entry><title type="html">Pandoc on TravisCI</title><link href="https://tlvince.com/pandoc-on-travisci" rel="alternate" type="text/html" title="Pandoc on TravisCI" /><published>2017-08-19T00:00:00+00:00</published><updated>2017-08-19T00:00:00+00:00</updated><id>https://tlvince.com/pandoc-on-travisci</id><content type="html" xml:base="https://tlvince.com/pandoc-on-travisci">&lt;p&gt;A few approaches to running &lt;a href=&quot;https://pandoc.org/&quot;&gt;Pandoc&lt;/a&gt; in &lt;a href=&quot;https://travis-ci.com/&quot;&gt;TravisCI&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;1-sudo--apt-get&quot;&gt;1. sudo &amp;amp; apt-get&lt;/h2&gt;

&lt;p&gt;Using Travis’ &lt;a href=&quot;https://docs.travis-ci.com/user/installing-dependencies#Installing-Packages-on-Standard-Infrastructure&quot;&gt;standard infrastructure&lt;/a&gt;, you can simply use &lt;code&gt;apt-get&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-yaml&quot;&gt;sudo: true
before_install:
  - sudo apt-get -qq update
  - sudo apt-get install -y pandoc
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Depending on what Travis’ current Linux environment is (Ubuntu Trusty at the time of writing), this may be all you need. However, you may be limited to an old version of Pandoc (Trusty currently has &lt;a href=&quot;https://packages.ubuntu.com/trusty/pandoc&quot;&gt;v1.12.2&lt;/a&gt;).&lt;/p&gt;

&lt;h2 id=&quot;2-without-sudo--apt-addon&quot;&gt;2. Without sudo &amp;amp; APT addon&lt;/h2&gt;

&lt;p&gt;Using Travis’ &lt;a href=&quot;https://docs.travis-ci.com/user/installing-dependencies#Installing-Packages-on-Container-Based-Infrastructure&quot;&gt;container infrastructure&lt;/a&gt; (Docker), as pandoc is in the &lt;a href=&quot;https://github.com/travis-ci/apt-package-whitelist/search?utf8=%E2%9C%93&amp;amp;q=pandoc&amp;amp;type=&quot;&gt;APT addon whitelist&lt;/a&gt;, you can do:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-yaml&quot;&gt;addons:
  apt:
    packages:
      - pandoc
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;However, as before, this limits you to the version of pandoc currently in the Ubuntu repos.&lt;/p&gt;

&lt;h2 id=&quot;3-with-sudo-without-an-apt-repo&quot;&gt;3. With sudo, without an APT repo&lt;/h2&gt;

&lt;p&gt;As pandoc helpfully ships &lt;code&gt;.deb&lt;/code&gt; packages in its &lt;a href=&quot;https://github.com/jgm/pandoc/releases&quot;&gt;GitHub releases&lt;/a&gt;, you can download the &lt;code&gt;.deb&lt;/code&gt; and install it manually.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-yaml&quot;&gt;sudo: true
before_install:
  - curl -L https://github.com/jgm/pandoc/releases/download/1.19.2.1/pandoc-1.19.2.1-1-amd64.deb &amp;gt; pandoc.deb
  - sudo dpkg -i pandoc.deb
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The benefit here is that you can choose any version of Pandoc, so long as they continue to ship a &lt;code&gt;.deb&lt;/code&gt; for the right architecture.&lt;/p&gt;

&lt;h2 id=&quot;4-without-sudo-without-an-apt-repo&quot;&gt;4. Without sudo, without an APT repo&lt;/h2&gt;

&lt;p&gt;Taking the above further, we manually extract the &lt;code&gt;.deb&lt;/code&gt; without &lt;code&gt;sudo&lt;/code&gt; and thereby have faster job startup times (&lt;code&gt;sudo&lt;/code&gt;/non-container based infrastructure jobs take ~20 secs to spin up).&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-yaml&quot;&gt;before_install:
  - curl -L https://github.com/jgm/pandoc/releases/download/1.19.2.1/pandoc-1.19.2.1-1-amd64.deb &amp;gt; pandoc.deb
  - dpkg -x pandoc.deb .
  - export PATH=&quot;$PWD/usr/bin:$PATH&quot;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Note that this only works because Pandoc is built statically, and it is liable to break. However, coupled with caching, this method produces the fastest builds with arbitrary Pandoc versions.&lt;/p&gt;
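
&lt;p&gt;As a rough sketch of the caching idea (not necessarily what the linked &lt;code&gt;.travis.yml&lt;/code&gt; below does), the download can be skipped whenever the &lt;code&gt;.deb&lt;/code&gt; is already present. The &lt;code&gt;fetch_once&lt;/code&gt; helper and the cache path are hypothetical, and the cache directory would also need to be listed under Travis’ &lt;code&gt;cache.directories&lt;/code&gt; setting:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# Hypothetical helper: download a file only if it is not already cached.
fetch_once() {
  local url=&quot;$1&quot; dest=&quot;$2&quot;
  if [ -f &quot;$dest&quot; ]; then
    echo &quot;cache hit: $dest&quot;
  else
    mkdir -p &quot;$(dirname &quot;$dest&quot;)&quot;
    curl -fsSL &quot;$url&quot; -o &quot;$dest&quot;
  fi
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;It could then stand in for the &lt;code&gt;curl&lt;/code&gt; line, e.g. &lt;code&gt;fetch_once https://github.com/jgm/pandoc/releases/download/1.19.2.1/pandoc-1.19.2.1-1-amd64.deb &quot;$HOME/cache/pandoc.deb&quot;&lt;/code&gt;, followed by &lt;code&gt;dpkg -x&lt;/code&gt; on the cached file.&lt;/p&gt;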

&lt;p&gt;See &lt;a href=&quot;https://github.com/tlvince/talks/blob/c8f6d3ecd25f3fdd7c0db61fb498857a9fc4809a/.travis.yml&quot;&gt;tlvince/talks/.travis.yml&lt;/a&gt; for a version with caching.&lt;/p&gt;</content><author><name>Tom Vincent</name></author><summary type="html">A few approaches to running Pandoc in TravisCI.</summary></entry><entry><title type="html">Composable Yeoman Generators</title><link href="https://tlvince.com/composable-yeoman-generators" rel="alternate" type="text/html" title="Composable Yeoman Generators" /><published>2014-08-08T11:38:49+00:00</published><updated>2014-08-08T11:38:49+00:00</updated><id>https://tlvince.com/composable-yeoman-generators</id><content type="html" xml:base="https://tlvince.com/composable-yeoman-generators">&lt;p&gt;Yeoman generator &lt;a href=&quot;https://github.com/yeoman/generator/releases/tag/v0.17.0-pre.1&quot;&gt;v0.17.0&lt;/a&gt; included a useful new feature dubbed
&lt;a href=&quot;http://yeoman.io/authoring/composability.html&quot;&gt;composability&lt;/a&gt;. If you’ve ever wanted to reuse generators by calling one from
another, this is the feature you’ve been waiting for. Here’s a quick overview
of how you might use it.&lt;/p&gt;

&lt;h2 id=&quot;creating-a-generator&quot;&gt;Creating a generator&lt;/h2&gt;

&lt;p&gt;Let’s begin by creating a new generator. The Yeoman team have made it trivial to
get started via &lt;a href=&quot;https://github.com/yeoman/generator-generator&quot;&gt;generator-generator&lt;/a&gt;, so let’s fire it up:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;npm install -g yo generator-generator
mkdir my-generator &amp;amp;&amp;amp; cd my-generator
yo generator
cd generator-my-generator
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;em&gt;Note&lt;/em&gt;, as of generator-generator v0.4.4, an older version of yeoman-generator
without composability support is used. So first confirm that
&lt;code&gt;&quot;yeoman-generator&quot;: &quot;~0.17.0&quot;&lt;/code&gt; is listed in &lt;code&gt;package.json&lt;/code&gt; or update
accordingly.&lt;/p&gt;

&lt;p&gt;generator-generator produces commonly used templates such as &lt;code&gt;.jshintrc&lt;/code&gt; and
&lt;code&gt;.editorconfig&lt;/code&gt; for us, but wouldn’t it be nice if these were maintained
elsewhere? That’s where &lt;a href=&quot;https://github.com/eddiemonge/generator-common&quot;&gt;generator-common&lt;/a&gt; comes in.&lt;/p&gt;

&lt;h2 id=&quot;composability&quot;&gt;Composability&lt;/h2&gt;

&lt;p&gt;Here we’ll use &lt;code&gt;composeWith&lt;/code&gt; to programmatically call generator-common from our
new generator. Let’s remove the pre-generated templates and methods:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;rm -rf app/templates
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;em&gt;&lt;code&gt;app/index.js&lt;/code&gt;&lt;/em&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-js&quot;&gt;'use strict';

var yeoman = require('yeoman-generator');

var MyGeneratorGenerator = yeoman.generators.Base.extend({
  // Prototype methods
});

module.exports = MyGeneratorGenerator;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;By default, Yeoman calls every method in the generator’s prototype in sequence.
So let’s add a new method — &lt;code&gt;templates&lt;/code&gt; — that calls generator-common:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-js&quot;&gt;var MyGeneratorGenerator = yeoman.generators.Base.extend({
  templates: function() {
    this.composeWith('common', {});
  }
});
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Let’s give it a try:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;npm link
yo my-generator
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;If you haven’t previously installed generator-common, you’ll likely be shown an
error similar to:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;You don’t seem to have a generator with the name common installed.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;By default, &lt;code&gt;composeWith&lt;/code&gt; hooks into npm’s &lt;a href=&quot;http://blog.nodejs.org/2013/02/07/peer-dependencies/&quot;&gt;peerDependencies&lt;/a&gt; to resolve a
generator. (If you’re not familiar, a peer dependency is one that is installed
as a sibling.)&lt;/p&gt;

&lt;p&gt;So let’s indicate generator-common is a peer by appending it to &lt;code&gt;package.json&lt;/code&gt;’s
&lt;code&gt;peerDependencies&lt;/code&gt; block:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-json&quot;&gt;&quot;peerDependencies&quot;: {
  &quot;yo&quot;: &quot;&amp;gt;=1.0.0&quot;,
  &quot;generator-common&quot;: &quot;&amp;gt;=0.2.0&quot;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;em&gt;Note&lt;/em&gt;, I’ve followed Yeoman’s &lt;a href=&quot;https://github.com/yeoman/yeoman.io/blob/10aca980a4c0d5ea242ed22f3b2af32c95f45eae/app/authoring/composability.md#dependencies-or-peerdependencies&quot;&gt;recommendation&lt;/a&gt; of using a &lt;em&gt;higher or equal
to&lt;/em&gt; version qualifier to prevent conflicts.&lt;/p&gt;

&lt;p&gt;Let’s install generator-common and give our generator another spin:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;npm install -g generator-common
yo my-generator
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;All being well, you’ll see Yeoman’s noble face and your generated templates.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;     _-----_
    |       |    .--------------------------.
    |--(o)--|    |   Welcome to the Yeoman  |
   `---------´   |     Common generator!    |
    ( _´U`_ )    '--------------------------'
    /___A___\
     |  ~  |
   __'.___.'__
 ´   `  |° ´ Y `

&lt;/code&gt;&lt;/pre&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;We’ve barely scratched the surface of &lt;code&gt;composeWith&lt;/code&gt;’s potential, but have
covered just enough to get you started. See Yeoman’s &lt;a href=&quot;http://yeoman.io/authoring/composability.html&quot;&gt;composability&lt;/a&gt;
documentation for further information and &lt;a href=&quot;https://github.com/tlvince/generator-my-generator&quot;&gt;tlvince/generator-my-generator&lt;/a&gt;
for this tutorial’s source.&lt;/p&gt;</content><author><name>Tom Vincent</name></author><summary type="html">Yeoman generator v0.17.0 included a useful new feature dubbed composability. If you’ve ever wanted to reuse generators by calling one from another, this is the feature you’ve been waiting for. Here’s a quick overview of how you might use it.</summary></entry><entry><title type="html">AngularJS chained modules</title><link href="https://tlvince.com/angularjs-chained-modules" rel="alternate" type="text/html" title="AngularJS chained modules" /><published>2014-03-11T18:47:52+00:00</published><updated>2014-03-11T18:47:52+00:00</updated><id>https://tlvince.com/angularjs-chained-modules</id><content type="html" xml:base="https://tlvince.com/angularjs-chained-modules">&lt;p&gt;Using &lt;code&gt;var mod = angular.module('MyModule', [])&lt;/code&gt; to declare a module? &lt;em&gt;Don’t&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;As this &lt;a href=&quot;http://plnkr.co/edit/H6WR7iz0tILuOzyejCwL?p=preview&quot;&gt;plunkr&lt;/a&gt; demonstrates, &lt;code&gt;mod&lt;/code&gt; will be accessible on the global scope
(i.e. &lt;code&gt;window.mod&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;Same goes for &lt;code&gt;var ctrl = mod.controller('MyCtrl')&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;As you’ve &lt;a href=&quot;http://yuiblog.com/blog/2006/06/01/global-domination/&quot;&gt;no doubt heard&lt;/a&gt;, this is a bad idea as anything
on &lt;code&gt;window&lt;/code&gt; can be unwittingly overwritten. As a case in point, try
uncommenting line 6, then line 35 in the aforementioned plunkr and opening up your
browser’s console. &lt;code&gt;window.angular&lt;/code&gt; is no more.&lt;/p&gt;

&lt;p&gt;Unfortunately, Angular’s own documentation gives examples in this way, for
example the &lt;a href=&quot;http://docs.angularjs.org/guide/module&quot;&gt;module docs&lt;/a&gt; (correct as of &lt;a href=&quot;https://github.com/angular/angular.js/blob/78165c224d75418bd7721badb8082827e00c4539/docs/content/guide/module.ngdoc#L36-L47&quot;&gt;78165c224d&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Using a “chained” module definition alleviates this problem, such as:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-js&quot;&gt;angular.module('MyModule', []).controller('MyCtrl', function() {})
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;If your modules are starting to get large, use the “&lt;a href=&quot;https://github.com/angular/angular.js/blob/78165c224d75418bd7721badb8082827e00c4539/docs/content/guide/module.ngdoc#L201-L218&quot;&gt;module
retrieval&lt;/a&gt;” syntax (omit the dependency array argument) to get
a reference to a previously declared module and continue the module definition
in another file, e.g.:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-js&quot;&gt;angular.module('MyModule', [])

angular.module('MyModule').controller('MyCtrl', function() {})
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;em&gt;Note&lt;/em&gt;: be careful not to pass the dependency array a second time as it will
overwrite the previous module declaration!&lt;/p&gt;</content><author><name>Tom Vincent</name></author><summary type="html">Using var mod = angular.module('MyModule', []) to declare a module? Don’t.</summary></entry><entry><title type="html">Startup programming</title><link href="https://tlvince.com/startup-programming" rel="alternate" type="text/html" title="Startup programming" /><published>2013-11-04T23:30:00+00:00</published><updated>2013-11-04T23:30:00+00:00</updated><id>https://tlvince.com/startup-programming</id><content type="html" xml:base="https://tlvince.com/startup-programming">&lt;p&gt;Earlier in the year, I asked for advice on how to start a programming career in
tech startups. Fast forward eight months; landing four job offers, numerous
freelancing gigs and founding my own consultancy, here’s the advice I was given
that has stuck with me:&lt;/p&gt;

&lt;p&gt;The &lt;em&gt;best&lt;/em&gt; way to get high quality attention these days is by maintaining a
strong GitHub profile and communicating your skills via a blog. Contribute to
any open source projects you love and make that fact public.&lt;/p&gt;

&lt;p&gt;Learn Ruby (and by extension, Ruby on Rails), strengthen your HTML/CSS chops,
know JavaScript inside out; keep your skills fresh, practice and read tech
blogs.&lt;/p&gt;

&lt;p&gt;Eat, sleep and breathe Test Driven and Behaviour Driven Development. It’s the
way of the future for a long, long time.&lt;/p&gt;

&lt;p&gt;Hang out at hacker events; go to conferences, hackathons and workshops. Make
your presence known. People will be beating down your door to hire you.&lt;/p&gt;

&lt;p&gt;Finally (and most importantly), don’t waste your time at companies who don’t
practice pair programming in an agile environment. Pair programming is &lt;em&gt;the&lt;/em&gt;
fastest, most fun way to learn how to be a great programmer. It does wonders
for your communication and teaching skills.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Everyone&lt;/em&gt; needs a programmer that can communicate well with others and
collaborate to get things done.&lt;/p&gt;</content><author><name>Tom Vincent</name></author><summary type="html">Earlier in the year, I asked for advice on how to start a programming career in tech startups. Fast forward eight months; landing four job offers, numerous freelancing gigs and founding my own consultancy, here’s the advice I was given that has stuck with me:</summary></entry><entry><title type="html">Post-PhotoRec Strategies</title><link href="https://tlvince.com/post-photoreq-strategies" rel="alternate" type="text/html" title="Post-PhotoRec Strategies" /><published>2012-12-20T00:00:00+00:00</published><updated>2012-12-20T00:00:00+00:00</updated><id>https://tlvince.com/post-photoreq-strategies</id><content type="html" xml:base="https://tlvince.com/post-photoreq-strategies">&lt;p&gt;If you’ve ever been in the unfortunate situation where your hard disk fails
beyond recognition (&lt;a href=&quot;http://unix.stackexchange.com/questions/33284/recovering-ext4-superblocks&quot;&gt;like mine did&lt;/a&gt;), then you’ve likely come across
a low-level file recovery tool called &lt;a href=&quot;http://www.cgsecurity.org/wiki/PhotoRec&quot;&gt;PhotoRec&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;PhotoRec does a fantastic job of recovering files by matching byte headers with
signatures of known file formats. At the time of writing, it recognises over
&lt;a href=&quot;http://www.cgsecurity.org/wiki/File_Formats_Recovered_By_PhotoRec&quot;&gt;440 file formats&lt;/a&gt;, which covers just about every format you’re likely
to encounter day-to-day.&lt;/p&gt;

&lt;p&gt;However, the challenge &lt;em&gt;after&lt;/em&gt; using PhotoRec is what to do with its output; the
unavoidable result of the data carving technique it uses is that the underlying
directory tree and file names are lost. You are therefore left with a flat-level
tree containing thousands of seemingly nonsensical files with file names such as
&lt;code&gt;f1191548088.txt&lt;/code&gt;… Not particularly useful.&lt;/p&gt;

&lt;p&gt;This post looks at a few approaches you can use to organise the recovered files.&lt;/p&gt;

&lt;h2 id=&quot;sorting-strategies&quot;&gt;Sorting strategies&lt;/h2&gt;

&lt;p&gt;Let’s look at a few strategies to sort through the mess:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;#sort-by-file-extension&quot;&gt;Sort by file extension&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#hash-audit&quot;&gt;Hash audit&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#rename-using-metadata&quot;&gt;Rename using metadata&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;#remove-corrupt-files&quot;&gt;Remove corrupt files&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;sort-by-file-extension&quot;&gt;Sort by file extension&lt;/h3&gt;

&lt;p&gt;PhotoRec’s &lt;a href=&quot;http://www.cgsecurity.org/wiki/After_Using_PhotoRec#Sort_files_by_extension&quot;&gt;After Using PhotoRec&lt;/a&gt; wiki page lists a few methods to sort
files after using the tool. The mentioned Python script collates each file by
its file extension. Whilst by no means fully solving the problem, this method
can help in combination with other approaches. Although unlikely, this may also
be of use if the file system in use has a &lt;a href=&quot;http://stackoverflow.com/a/466596&quot;&gt;maximum files per directory
limit&lt;/a&gt;, such as FAT32.&lt;/p&gt;
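
&lt;p&gt;The same idea can be sketched in a few lines of Bash. This is not the wiki’s Python script, just a minimal equivalent; the &lt;code&gt;sort_by_extension&lt;/code&gt; name is hypothetical and files without an extension are assumed to belong in an &lt;code&gt;unknown&lt;/code&gt; directory:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# Hypothetical helper: move every file under $1 into $2/EXTENSION/.
sort_by_extension() {
  local src=&quot;$1&quot; dest=&quot;$2&quot; file name ext
  find &quot;$src&quot; -type f -print0 | while IFS= read -r -d '' file; do
    name=&quot;${file##*/}&quot;            # strip the directory part
    case &quot;$name&quot; in
      *.*) ext=&quot;${name##*.}&quot; ;;   # text after the last dot
      *)   ext=&quot;unknown&quot; ;;       # no extension at all
    esac
    mkdir -p &quot;$dest/$ext&quot;
    mv -n &quot;$file&quot; &quot;$dest/$ext/&quot;   # -n: never overwrite an existing file
  done
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;For example, &lt;code&gt;sort_by_extension recovered sorted&lt;/code&gt; (both directory names are placeholders) would leave &lt;code&gt;sorted/txt&lt;/code&gt;, &lt;code&gt;sorted/jpg&lt;/code&gt; and so on. The destination should sit outside the source directory.&lt;/p&gt;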

&lt;h3 id=&quot;hash-audit&quot;&gt;Hash audit&lt;/h3&gt;

&lt;p&gt;&lt;a href=&quot;http://md5deep.sourceforge.net&quot;&gt;hashdeep&lt;/a&gt;, a program that computes and matches hashsets, has an &lt;em&gt;audit&lt;/em&gt;
function that can compare file hashes against a known set. If you have a
known-good backup, this can be an effective way to determine which files you
already have and then prune them from PhotoRec’s set.&lt;/p&gt;
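
&lt;p&gt;To illustrate the principle without depending on hashdeep itself, here is a rough stand-in built on GNU coreutils’ &lt;code&gt;sha256sum&lt;/code&gt;. The &lt;code&gt;list_duplicates&lt;/code&gt; name is hypothetical, and file names containing newlines or backslashes are assumed not to occur:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;# Hypothetical helper: print recovered files whose SHA-256 digest
# also appears somewhere in the known-good backup.
list_duplicates() {
  local backup=&quot;$1&quot; recovered=&quot;$2&quot; known
  known=&quot;$(mktemp)&quot;
  # Digest (first field) of every file in the backup.
  find &quot;$backup&quot; -type f -exec sha256sum {} + | awk '{ print $1 }' | sort -u &amp;gt; &quot;$known&quot;
  # Print the path (column 67 onwards) of each recovered file with a known digest.
  find &quot;$recovered&quot; -type f -exec sha256sum {} + |
    awk 'NR == FNR { seen[$1] = 1; next } ($1 in seen) { print substr($0, 67) }' &quot;$known&quot; -
  rm -f &quot;$known&quot;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Running &lt;code&gt;list_duplicates backup recovered&lt;/code&gt; (placeholder directory names) only prints paths; it is worth reviewing that list before deleting anything.&lt;/p&gt;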

&lt;h3 id=&quot;rename-using-metadata&quot;&gt;Rename using metadata&lt;/h3&gt;

&lt;p&gt;A fortunate side-effect of using binary formats is that metadata is often saved
alongside its content. Depending on the format, a number of tools can be used to
re-organise the recovered file without reliance on file names.&lt;/p&gt;

&lt;h4 id=&quot;photos&quot;&gt;Photos&lt;/h4&gt;

&lt;p&gt;In the case of photos, we can use the excellent &lt;a href=&quot;http://www.sno.phy.queensu.ca/~phil/exiftool/&quot;&gt;exiftool&lt;/a&gt; to rebuild a
directory tree based on their timestamp:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;exiftool -r '-FileName&amp;lt;CreateDate' -d %Y/%m/%Y%m%d_%H%M%S%%-c.%%e [files]
&lt;/code&gt;&lt;/pre&gt;

&lt;h4 id=&quot;music&quot;&gt;Music&lt;/h4&gt;

&lt;p&gt;Music can be handled elegantly using &lt;a href=&quot;https://musicbrainz.org/doc/MusicBrainz_Picard&quot;&gt;MusicBrainz Picard&lt;/a&gt;. For a given
audio file, it will use acoustic fingerprinting techniques to generate a hash of
said file and then query it against the MusicBrainz database to determine its
contents.&lt;/p&gt;

&lt;p&gt;Be sure to read through Picard’s &lt;a href=&quot;https://musicbrainz.org/doc/How_to_Tag_Files_With_Picard&quot;&gt;how-to guide&lt;/a&gt;, particularly the
clustering function, which greatly speeds up the querying process. Also, at the
time of writing, the latest release of Picard (v1.2) contains a memory leak
which causes it to hang when dealing with large datasets. Try running the
&lt;a href=&quot;https://github.com/musicbrainz/picard&quot;&gt;development version&lt;/a&gt; (the issue is resolved in pull-requests
&lt;a href=&quot;https://github.com/musicbrainz/picard/pull/143&quot;&gt;#143&lt;/a&gt; and &lt;a href=&quot;https://github.com/musicbrainz/picard/pull/146&quot;&gt;#146&lt;/a&gt;) if you experience this.&lt;/p&gt;

&lt;p&gt;Alternatively, many cloud-based music platforms such as Google Play Music or
iTunes have a “scan and match” feature (using similar fingerprinting
technologies as Picard), which will provide high bitrate, fully-tagged versions
of recognised files available to stream or re-download.&lt;/p&gt;

&lt;h3 id=&quot;remove-corrupt-files&quot;&gt;Remove corrupt files&lt;/h3&gt;

&lt;p&gt;Unfortunately, there isn’t a universal way of determining whether a file is
corrupt. However, depending on the importance of your recovered data, there are
a few approaches worth trying:&lt;/p&gt;

&lt;h4 id=&quot;photos-1&quot;&gt;Photos&lt;/h4&gt;

&lt;p&gt;The &lt;a href=&quot;http://www.pythonware.com/products/pil/&quot;&gt;Python Imaging Library&lt;/a&gt; (PIL) contains a &lt;a href=&quot;http://effbot.org/imagingbook/image.htm&quot;&gt;verify method&lt;/a&gt;
(search for ‘verify’) that should catch obvious corruptions. After installing
PIL, try running Denilson Sá’s &lt;a href=&quot;https://bitbucket.org/denilsonsa/small_scripts/src/96af96e23bc1e19c9156412cdbb8eeba09e21cad/jpeg_corrupt.py&quot;&gt;jpeg_corrupt&lt;/a&gt;, which is a thin
command-line wrapper around PIL’s verify method; given a glob of input
paths, it prints the names of those that &lt;em&gt;verify&lt;/em&gt; determines to be corrupt.&lt;/p&gt;

&lt;h4 id=&quot;musicvideos&quot;&gt;Music/Videos&lt;/h4&gt;

&lt;p&gt;Running &lt;a href=&quot;http://ffmpeg.org&quot;&gt;ffmpeg&lt;/a&gt; without an output file parameter displays information about
the given file. If ffmpeg is unable to parse the file, it’ll spit out a warning,
which can be leveraged to filter and delete corrupt files, e.g.:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;language-bash&quot;&gt;for i in *; do ffmpeg -i &quot;$i&quot; 2&amp;gt;&amp;amp;1 | grep -q 'Invalid data found when processing input' &amp;amp;&amp;amp; rm &quot;$i&quot;; done
&lt;/code&gt;&lt;/pre&gt;</content><author><name>Tom Vincent</name></author><summary type="html">If you’ve ever been in the unfortunate situation where your hard disk fails beyond recognition (like mine did), then you’ve likely come across a low-level file recovery tool called PhotoRec.</summary></entry><entry><title type="html">Persnickety design</title><link href="https://tlvince.com/persnickety-design" rel="alternate" type="text/html" title="Persnickety design" /><published>2012-12-11T20:44:45+00:00</published><updated>2012-12-11T20:44:45+00:00</updated><id>https://tlvince.com/persnickety-design</id><content type="html" xml:base="https://tlvince.com/persnickety-design">&lt;p&gt;“Web Design is 95% Typography” &lt;a href=&quot;http://informationarchitects.net/blog/the-web-is-all-about-typography-period/&quot;&gt;they say&lt;/a&gt;… and I tend to agree. This
post looks at how I improved my site’s typography using a Node.js module and
closes with a remark on CSS hyphenation.&lt;/p&gt;

&lt;h2 id=&quot;typography&quot;&gt;Typography&lt;/h2&gt;

&lt;p&gt;Like &lt;a href=&quot;http://stevelosh.com/blog/2010/09/making-my-site-sing/#finding-a-starting-point&quot;&gt;Steve Losh&lt;/a&gt;, the underlying goal of my site (in terms of design) is
minimalism. I use little-to-no images, a large font and a narrow measure. This
text-centric “&lt;a href=&quot;http://informationarchitects.net/blog/the-web-is-all-about-typography-period/&quot;&gt;text as a user interface&lt;/a&gt;” approach is intended to make
my site a pleasure to read without tools like Readability.&lt;/p&gt;

&lt;p&gt;Behind the curtains, all content of this site is written in Markdown and parsed
as HTML using &lt;a href=&quot;https://github.com/chjj/marked&quot;&gt;marked&lt;/a&gt;. Whilst marked is a fantastic parser, it (currently)
does not support any typographical-enhancing extensions, such as those provided
by &lt;a href=&quot;http://daringfireball.net/projects/smartypants/&quot;&gt;SmartyPants&lt;/a&gt;. Enter &lt;em&gt;typogr.js&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/ekalinin/typogr.js&quot;&gt;typogr.js&lt;/a&gt; is a small Node library with the aim to do one thing and to do it
well: apply transformations on plain text to yield typographically-improved
HTML. It can apply a raft of typographical filters besides those provided by
SmartyPants. See its &lt;a href=&quot;https://github.com/ekalinin/typogr.js#api&quot;&gt;API&lt;/a&gt; for more details.&lt;/p&gt;

&lt;p&gt;After &lt;a href=&quot;https://github.com/ekalinin/typogr.js/pulls/tlvince&quot;&gt;a few&lt;/a&gt; &lt;a href=&quot;https://github.com/ekalinin/typogr.js/pulls/tlvince?direction=desc&amp;amp;page=1&amp;amp;sort=created&amp;amp;state=closed&quot;&gt;patches&lt;/a&gt;, I use typogr.js throughout this
site. Besides smart quotes and correct use of &lt;a href=&quot;http://www.smashingmagazine.com/2011/08/15/mind-your-en-and-em-dashes-typographic-etiquette/&quot;&gt;en- and em-dashes&lt;/a&gt;,
ordinals are styled to match &lt;code&gt;sup&lt;/code&gt; tags (such as those used on a post’s
&lt;a href=&quot;/persnickety-design#date-authored&quot;&gt;authored date&lt;/a&gt;), the imposition of block capitals (such as “API”) is
reduced to match surrounding body text and &lt;a href=&quot;https://en.wikipedia.org/wiki/Widow_(typesetting)&quot;&gt;widows&lt;/a&gt; (lines containing only a
single word) are eliminated through careful placement of &lt;code&gt;&amp;amp;nbsp;&lt;/code&gt;.&lt;/p&gt;

&lt;h2 id=&quot;a-bleak-aside-on-hyphenation&quot;&gt;A bleak aside on hyphenation&lt;/h2&gt;

&lt;p&gt;As with the last iteration of this site, I was keen to use hyphenation.
Previously, I was using &lt;a href=&quot;https://code.google.com/p/hyphenator/&quot;&gt;hyphenator&lt;/a&gt;, which, all-in-all, works rather well.
However, since this iteration &lt;em&gt;proudly&lt;/em&gt; uses zero JavaScript, I preferred a CSS
approach.&lt;/p&gt;

&lt;p&gt;Alas, although CSS3’s hyphenation &lt;a href=&quot;http://caniuse.com/css-hyphens&quot;&gt;works wonderfully in Firefox&lt;/a&gt;,
WebKit has yet to catch up. I toyed with enabling it regardless, but as &lt;a href=&quot;https://github.com/h5bp/html5-boilerplate/issues/708#issuecomment-1861631&quot;&gt;Divya
Manian states&lt;/a&gt;, hyphenation without justified text
&lt;em&gt;reduces&lt;/em&gt; readability.&lt;/p&gt;

&lt;p&gt;Besides conditionally setting justified text via CSS browser hacks, native
support for hyphenation &lt;em&gt;and&lt;/em&gt; justified text is &lt;em&gt;still&lt;/em&gt; impractical as of 2012.
Let’s hope 2013 is the year of the hyphen.&lt;/p&gt;</content><author><name>Tom Vincent</name></author><summary type="html">“Web Design is 95% Typography” they say… and I tend to agree. This post looks at how I improved my site’s typography using a Node.js module and closes with a remark on CSS hyphenation.</summary></entry></feed>