
People at a Startups & Beer event

Our Spark cluster setup

Running our own Spark cluster has proved to be a valuable experience for us.

Read

Sortable peeps having lunch

BRIN indexes in Postgres 9.5

We recently had a chance to explore one of Postgres 9.5’s new features: block range indexes.

Read

Fuzzy and incomprehensible graph on a whiteboard.

Rolling up lint.

Our experience making linting into a standard part of our build.

Read

The hiring committee staring awkwardly at each other

How we hire: A roundtable discussion

Hiring is hard. We assembled our engineering hiring committee — senior developers Mark, Colin, Graeme, and Steven — for an open discussion on what goes on behind the scenes when you apply for a position at Sortable.

Read

Office scene with Rubik's cube

Spark Performance: Is the DirectParquetOutputCommitter really better?

Why can the DirectParquetOutputCommitter be more efficient? How much more efficient? Are there any downsides?

Read

Steven explaining something

Compromising with legacy code: an anecdote

Last year I was making a change to some of our legacy code and learned a bit about sacrificing technical correctness for other, less technical, goals in the process.

Read

Sampling Profiler in Bash

Building a sampling profiler with 30-year-old technology

What if you need to diagnose a performance issue in a non-invasive manner? I recently found myself in this situation. Here’s my solution: a sampling profiler.

Read

Spark Performance Optimization

Improving Spark performance with repartitioning

One of the tricks that we’ve found to improve the performance of Spark jobs is to change the partitioning of our data. We’ll illustrate how this can help with a simple example.

Read

An excerpt from the Scala language spec

Why you can't always flatMap to Option

A little while ago, something that had always worked for me in Scala suddenly didn’t work. Here’s what happened.

Read