This is an ongoing log of interesting bits I pick up along the way.
The formatting is a bit rough and this format won't scale to years of additional entries, so I'll set up some organizational scheme as this grows.
Contents
- Contents
- Docker
- Docker file/folder permissions only consider uid/gid, not user/group names
- ‘Reaching around’ Docker to operate inside a container is kind of a pain - so automate as much as possible
- When starting to use Docker, you may make many simple mistakes
- Intermittent host networking oddness with Docker - or, easily find many unexpected bugs with a flaky network
- ElasticSearch in Docker
- Nix
- Ubuntu Linux
- Uber Cadence - workflow & orchestration engine
- Go (Golang)
- Lua
- SQL
- Nginx
- OpenResty
- Rendering a file via lua-resty-template is 3x faster than serving the same file statically via Nginx w/ no Lua code involved (at least for some as-yet-unpublished Nginx configuration)
- OpenResty configuration is parallel and separate from Nginx configuration
- lua-resty-mysql gotcha: NULL MySQL values are represented as ngx.null, not nil - unintuitively (and nil ~= ngx.null)
- Be careful with internal redirects
- Sublime Text
- Command Line
- Bash
- ps
- GitLab
- GitLab CI/CD
- Storycap
- React
- Uber react-vis data visualization component library for React
- react-vis was marked deprecated Jan 2020, then un-deprecated May 2020
- How to build react-vis
- Build error workarounds
- Running the react-vis Storybook
- Rendering react-vis Storybook components to images with Storycap
- Don’t forget to include the CSS to render react-vis visualizations properly
- <Crosshair> default rendering behavior, overriding defaults
- Fonts
- Kibana
- Combo TILs
Docker
Docker file/folder permissions only consider uid/gid, not user/group names
I first encountered this when a volume became unexpectedly owned by user `100`.
It seemed like a bug, but I brushed it off as a low-level 'Docker-ism' that I didn't understand yet.
This makes sense, because there’s no shared user/group registry across the container and the host OS (or between containers).
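You can see this from the host with a quick experiment (the image and paths here are just an illustration):

```bash
mkdir -p data && chmod 777 data   # let any uid write, just for the demo
# Create a file from inside a container as uid/gid 100 (a user that likely
# doesn't exist on the host), then inspect it from the host:
docker run --rm -u 100:100 -v "$PWD/data:/data" alpine touch /data/example
ls -ln data/example               # owner shows up as the raw ids 100:100, not a name
```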
‘Reaching around’ Docker to operate inside a container is kind of a pain - so automate as much as possible
The extra level of indirection of typing `docker exec` is a pain when you do it 100+ times.
I added scaffolding to my projects to avoid it during development:
- mount volumes pointing at source files via `docker-compose`, instead of embedding them in the image (see the sketch below)
- auto-restart scripts when their source changes (e.g., with Python's `lazarus` module)
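A minimal `docker-compose.yml` sketch of the volume-mount part (the service name and paths are made up):

```yaml
services:
  app:
    build: .
    volumes:
      - ./src:/app/src   # edit on the host, changes appear in the container immediately
```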
With these in place, my workflow became much simpler.
When starting to use Docker, you may make many simple mistakes
As I transitioned from beginner to intermediate level with Docker, I found it a bit like programming in C: very easy to break things with simple mistakes.
Most of the time, the mistakes I made fell into two categories:
- not refreshing / recreating something that is cached (an image, a volume, etc.)
- an incomplete mental model of the Linux process model, environment variables, etc.
Intermittent host networking oddness with Docker - or, easily find many unexpected bugs with a flaky network
After setting up Docker on one machine and creating & stopping many containers, Chrome started acting strangely:
- auto-complete in the address bar would empty itself while typing
- pages would sometimes not fully load (esp. search engine results)
- frequent `ERR_NETWORK_CHANGED` error pages
The solution was simple: I ran this and then rebooted:
docker network prune
I didn't get a chance to find the root cause of the problem, but pruning the machine's Docker networks fixed it.
ElasticSearch in Docker
If you mounted a volume to `/usr/share/elasticsearch/data/nodes` and see a long Java exception, then you may have a directory permissions problem:
elasticsearch_1 | OpenJDK 64-Bit Server VM warning: Option UseConcMarkSweepGC was deprecated in version 9.0 and will likely be removed in a future release.
elasticsearch_1 | OpenJDK 64-Bit Server VM warning: UseAVX=2 is not supported on this CPU, setting it to UseAVX=1
elasticsearch_1 | [2020-06-19T04:58:49,463][INFO ][o.e.n.Node ] [] initializing ...
elasticsearch_1 | [2020-06-19T04:58:49,495][WARN ][o.e.b.ElasticsearchUncaughtExceptionHandler] [] uncaught exception in thread [main]
elasticsearch_1 | org.elasticsearch.bootstrap.StartupException: java.lang.IllegalStateException: Failed to create node environment
elasticsearch_1 | at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:140) ~[elasticsearch-6.4.0.jar:6.4.0]
elasticsearch_1 | at org.elasticsearch.bootstrap.Elasticsearch.execute(Elasticsearch.java:127) ~[elasticsearch-6.4.0.jar:6.4.0]
elasticsearch_1 | at org.elasticsearch.cli.EnvironmentAwareCommand.execute(EnvironmentAwareCommand.java:86) ~[elasticsearch-6.4.0.jar:6.4.0]
elasticsearch_1 | at org.elasticsearch.cli.Command.mainWithoutErrorHandling(Command.java:124) ~[elasticsearch-cli-6.4.0.jar:6.4.0]
elasticsearch_1 | at org.elasticsearch.cli.Command.main(Command.java:90) ~[elasticsearch-cli-6.4.0.jar:6.4.0]
elasticsearch_1 | at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:93) ~[elasticsearch-6.4.0.jar:6.4.0]
elasticsearch_1 | at org.elasticsearch.bootstrap.Elasticsearch.main(Elasticsearch.java:86) ~[elasticsearch-6.4.0.jar:6.4.0]
elasticsearch_1 | Caused by: java.lang.IllegalStateException: Failed to create node environment
elasticsearch_1 | at org.elasticsearch.node.Node.<init>(Node.java:277) ~[elasticsearch-6.4.0.jar:6.4.0]
elasticsearch_1 | at org.elasticsearch.node.Node.<init>(Node.java:256) ~[elasticsearch-6.4.0.jar:6.4.0]
elasticsearch_1 | at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:213) ~[elasticsearch-6.4.0.jar:6.4.0]
elasticsearch_1 | at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:213) ~[elasticsearch-6.4.0.jar:6.4.0]
elasticsearch_1 | at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:326) ~[elasticsearch-6.4.0.jar:6.4.0]
elasticsearch_1 | at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:136) ~[elasticsearch-6.4.0.jar:6.4.0]
elasticsearch_1 | ... 6 more
Here’s the important line of the exception:
elasticsearch_1 | Caused by: java.nio.file.AccessDeniedException: /usr/share/elasticsearch/data/nodes
elasticsearch_1 | at sun.nio.fs.UnixException.translateToIOException(UnixException.java:90) ~[?:?]
elasticsearch_1 | at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:111) ~[?:?]
elasticsearch_1 | at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:116) ~[?:?]
elasticsearch_1 | at sun.nio.fs.UnixFileSystemProvider.createDirectory(UnixFileSystemProvider.java:385) ~[?:?]
elasticsearch_1 | at java.nio.file.Files.createDirectory(Files.java:682) ~[?:?]
elasticsearch_1 | at java.nio.file.Files.createAndCheckIsDirectory(Files.java:789) ~[?:?]
elasticsearch_1 | at java.nio.file.Files.createDirectories(Files.java:775) ~[?:?]
elasticsearch_1 | at org.elasticsearch.env.NodeEnvironment.<init>(NodeEnvironment.java:203) ~[elasticsearch-6.4.0.jar:6.4.0]
elasticsearch_1 | at org.elasticsearch.node.Node.<init>(Node.java:274) ~[elasticsearch-6.4.0.jar:6.4.0]
elasticsearch_1 | at org.elasticsearch.node.Node.<init>(Node.java:256) ~[elasticsearch-6.4.0.jar:6.4.0]
elasticsearch_1 | at org.elasticsearch.bootstrap.Bootstrap$5.<init>(Bootstrap.java:213) ~[elasticsearch-6.4.0.jar:6.4.0]
elasticsearch_1 | at org.elasticsearch.bootstrap.Bootstrap.setup(Bootstrap.java:213) ~[elasticsearch-6.4.0.jar:6.4.0]
elasticsearch_1 | at org.elasticsearch.bootstrap.Bootstrap.init(Bootstrap.java:326) ~[elasticsearch-6.4.0.jar:6.4.0]
elasticsearch_1 | at org.elasticsearch.bootstrap.Elasticsearch.init(Elasticsearch.java:136) ~[elasticsearch-6.4.0.jar:6.4.0]
The usual solution is to `chown 1000:1000` the local directory – because the uid of the `elasticsearch` user inside the ElasticSearch Docker image is `1000`.
But this didn't work for me.
To fix it, I `rm`'d the Docker container and deleted the local Elasticsearch data directory altogether.
After starting the container again, the error occurred yet again.
So I stopped the container, changed the permissions on that local folder (`chown 1000 elastic-data`), and restarted the Elasticsearch container.
Problem solved.
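For future reference, the full sequence looks roughly like this (assuming a compose service named `elasticsearch` with `./elastic-data` bind-mounted as the data directory - adjust names to your setup):

```bash
docker-compose stop elasticsearch
sudo rm -rf elastic-data          # nuke the old data directory (destructive!)
mkdir elastic-data
sudo chown 1000 elastic-data      # match the in-container elasticsearch uid
docker-compose up -d elasticsearch
```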
There’s probably a more elegant solution, but this workaround is good enough for me for now.
Nix
`nixpkgs` can update and break previously-building derivations
Found this while learning how to wrap my Python static site generator in a Nix Flake.
After much experimentation, I got `nix build` working.
It built my static site generator Python package and stored the entry-point CLI scripts in `result/bin/`.
All was well.
An hour passed.
I didn't change anything - all files were exactly the same (confirmed by `git status`).
I ran `nix build` again, and the build broke (see the error below):
$ nix build
warning: Git tree '/home/devon/Projects/ds-ssg' is dirty
warning: updating lock file '/home/devon/Projects/ds-ssg/flake.lock':
• Updated input 'nixpkgs':
'github:nixos/nixpkgs/238db8df98a37821158d71e4ea326c1e42746ce6' (2022-11-12)
→ 'github:nixos/nixpkgs/ee01de29d2f58d56b1be4ae24c24bd91c5380cea' (2022-09-01)
warning: Git tree '/home/devon/Projects/ds-ssg' is dirty
error: Automatic extraction of 'pname' from python package source /nix/store/22dqvarbr9wq3rpd22nmsp2x1q8bxvki-fq53hngqvrg9ir8k4n0glz7v3iqlbs55-source failed.
Please manually specify 'pname'
(use '--show-trace' to show detailed location information)
Searching turned up this GitHub issue: nixpkgs spyder: Automatic extraction of ‘pname’ from python package source /nix/store/a8g2zfzxh3v1hjnxz6839fjkq1caksmr-python3.8-spyder-4.1.5 failed. #207.
The solution was to pin `nixpkgs` to the recommended version in the `flake.nix` inputs:
nixpkgs.url = "github:nixos/nixpkgs/554d2d8aa25b6e583575459c297ec23750adb6cb";
After pinning `nixpkgs` like this, `nix build` succeeded.
Ubuntu Linux
How to restart `compiz` without tears (tested on Ubuntu 16.04 LTS only)
There's a linear relationship between `uptime` and `compiz` instability.
The longer your desktop is running, the less stable `compiz` becomes.
Symptoms of an unstable `compiz` include: the display not updating correctly, invisible windows, and other visible artifacts.
Searching for a solution, I must have slogged through hundreds of forum threads & Stack Overflow questions – none of them correct.
Finally, I found a way that worked (at least once):
killall compiz
After a few seconds (and screen updates while everything restarted) `compiz` restarted automatically (sort of like when you kill `dwm.exe` on Windows).
And then everything was as fast & responsive as a fresh login.
In the case that `compiz` doesn't restart automatically, you'll want to have the following command ready to execute in a new virtual terminal (e.g., `ALT-F3`):
DISPLAY=:0 compiz --replace
Note that killing `compiz` is a potentially very destructive operation – and you simply won't have a 100% success rate.
So be exceedingly careful and try this out a few times when you won't lose any important work, before those times when you really need it (ironically, your need to restart `compiz` is highly correlated with having many windows open…).
Uber Cadence - workflow & orchestration engine
How to find the line of code that's causing an `ActivityTaskFailed` event in your history log
NOTE TO SELF: I figured this out a while back, dig through my notes and write this up here.
Go (Golang)
`ioutil.WriteFile()` does not call `fsync()`!
I ran into a strange issue in a tool I wrote in Go. That tool:
- saves a binary file ("A") using `ioutil.WriteFile()`
- executes another tool that processes file A
Processing a (binary) file saved by `ioutil.WriteFile()` with a 3rd-party tool didn't work: the tool reported an error for that file.
When I looked at the file (after the fact!), everything looked fine.
I even ran the command by hand from the command line – it succeeded.
This 'bug' has been reported before.
Frankly, it's surprising the documentation doesn't mention this. Asynchrony is definitely the big feature with Go, so it's no surprise ???
Instead, you can use Go's (more verbose) file-writing APIs, which do.
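A minimal sketch of the workaround (nothing here is from the original tool):

```go
package main

import (
	"log"
	"os"
)

// writeFileSynced is like ioutil.WriteFile / os.WriteFile, but flushes the
// data to stable storage with fsync before returning.
func writeFileSynced(path string, data []byte, perm os.FileMode) error {
	f, err := os.OpenFile(path, os.O_WRONLY|os.O_CREATE|os.O_TRUNC, perm)
	if err != nil {
		return err
	}
	if _, err := f.Write(data); err != nil {
		f.Close()
		return err
	}
	if err := f.Sync(); err != nil { // the step ioutil.WriteFile skips
		f.Close()
		return err
	}
	return f.Close()
}

func main() {
	if err := writeFileSynced("a.bin", []byte{0x00, 0x01, 0x02}, 0644); err != nil {
		log.Fatal(err)
	}
}
```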
Be careful with `struct` member identifiers' capitalization!?
Go struct members are only visible outside the package they're defined in if their identifier is capitalized.
While I should have read the first few chapters of an intro-to-Go book, I jumped straight into using it with Uber Cadence.
Instead, I found this out the hard way: I defined a `struct` with all lower-case member identifiers, and pulled out much hair trying to figure out why the `result` in my `ActivityTaskCompleted` events was always empty (`{}` in the JSON displayed by the `cadence-web` UI).
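A minimal sketch of the gotcha, using plain `encoding/json` (which, as far as I understand, the Cadence Go client uses by default to serialize results):

```go
package main

import (
	"encoding/json"
	"fmt"
)

type badResult struct {
	count int    // unexported: invisible outside this package, silently dropped by JSON
	name  string
}

type GoodResult struct {
	Count int    `json:"count"`
	Name  string `json:"name"`
}

func main() {
	b, _ := json.Marshal(badResult{count: 1, name: "x"})
	g, _ := json.Marshal(GoodResult{Count: 1, Name: "x"})
	fmt.Println(string(b)) // prints: {}
	fmt.Println(string(g)) // prints: {"count":1,"name":"x"}
}
```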
Lesson: Go has some “foot-knives” (this isn’t the only example).
Lua
Struct-of-arrays instead of array-of-structs
There's a gotcha waiting before we can even use struct-of-arrays – one that may negate the benefits of using the more convoluted struct-of-arrays layout: array initialization.
You'd be forgiven for assuming that just naively creating a new table and populating it like so is good enough:
-- assumes `anotherList` and `someFunction` are defined elsewhere
local t = { firstList = {}, secondList = {} }
for i, v in ipairs(anotherList) do
  t.firstList[i] = v
  t.secondList[i] = someFunction(v)
end
But this assumes that certain key operations are O(1):
- inserting items
- growing the arrays (/ table) on insert
As it turns out, those operations are not necessarily O(1), and may be O(N) per operation (!?).
Fast(er)(est) Table Inserts in LuaJIT uses some clever tricks, like calling LuaJIT's `table.new()` to pre-allocate a table of a known, fixed size.
Pre-allocating tables
Related to the TIL entry above, there are cases where pre-allocating a table of fixed size is faster than resizing / moving the table around in memory on inserts.
lua-users Wiki: Table Preallocation
LuaJIT's `table.new()` does exactly this.
(TODO See if stock lua 5.1+ does this too…)
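A minimal sketch of the LuaJIT route (the size `n` and the loop body are just placeholders):

```lua
-- table.new is a LuaJIT extension (not in stock Lua 5.1); require it explicitly.
local table_new = require("table.new")

local n = 1000000
local xs = table_new(n, 0)  -- pre-allocate n array slots, 0 hash slots
for i = 1, n do
  xs[i] = i * 2             -- no rehash/resize during the loop
end
```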
SQL
“String interning” for faster string ops with pseudo-hash columns
I ran into a big performance roadblock when storing millions of URLs in a MySQL database.
The `url` column was a `VARCHAR(2083)` with a `UNIQUE` constraint.
The problem was that querying and inserting degraded to a full scan (!) in most cases.
The solution is a clever hack using a so-called "pseudo-hash" column that's derived from the URL.
This is a form of string interning that works within SQL's limitations.
This trick is documented in “High Performance MySQL”, in the part of chapter 5 that talks about hash indexes.
Intelligent Database Design Using Hash Keys argues that hash collisions with even a 32-bit hash key are acceptable for a 50M-row table – since the O(N) scan during queries will have very small N. But this probability table indicates otherwise: with as few as 110,000 rows hashed into a 32-bit hash key column, there's roughly a 3/4 chance that at least one collision has already occurred. Which isn't terrible, but the N factor in the O(N) scan will definitely grow as more rows are added, yielding a constant overhead for any INSERT or SELECT.
5 Ways to Make Hexadecimal Identifiers Perform Better on MySQL explains that you’ll hit about 100% probability of hash collision with as few as 100K rows when using a 32-bit hash key.
Hash-based Workarounds for MySQL Unique Constraint Limitations shows how to work around the size limit in MySQL that prevents a UNIQUE index above a certain size: computing and storing a shorter hash of all column values (!), and making the hash column UNIQUE.
Using MySQL 5.7 Generated Columns to Increase Query Performance shows how to use MySQL 5.7’s new “virtual columns” ???
A hash-based GROUP BY strategy for MySQL shows how to use a 128-bit `BINARY(16)` hash column to index a 36-million-row table.
On the topic of what datatype / column-‘width’ to use for the hash key:
MySQL and Binary(16) – The Reasons/Benefits/Drawbacks reports that looking up a string by its 16-byte hash key takes 0.0019 seconds, even when looking up 100 strings simultaneously via a `SELECT ... WHERE ... IN (<list of hash keys>)` query.
MySQL – Binary(16) and scalability talks about optimizing joins with 128-bit hash keys.
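Putting the pieces together, a rough sketch of the pseudo-hash trick on MySQL 5.7+ (the table and column names are made up, and you may prefer a different hash than MD5):

```sql
CREATE TABLE urls (
  id       BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY,
  url      VARCHAR(2083) NOT NULL,
  -- stored generated column: a short, fixed-width hash of the long URL
  url_hash BINARY(16) AS (UNHEX(MD5(url))) STORED,
  UNIQUE KEY idx_url_hash (url_hash)
);

-- Look up via the short BINARY(16) index, then compare the full URL to
-- resolve any hash collisions:
SELECT id
FROM urls
WHERE url_hash = UNHEX(MD5('https://example.com/some/long/path'))
  AND url = 'https://example.com/some/long/path';
```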
Some versions of MySQL have an ‘open file descriptor leak’
Under certain conditions (complex joins? temporary tables?), some versions of MySQL will keep temp table file descriptors open indefinitely.
This will eventually (or quickly!) exhaust the total available MySQL server file descriptors.
To see if this is occurring on your machine: First, get the process ID of your running MySQL server. Then, run the following command to see the open file descriptors:
sudo ls -la /proc/<mysql server pid>/fd
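To count them (rather than eyeball the listing), something like this works - `mysqld` is assumed to be the process name on your distro:

```bash
pid=$(pgrep -o mysqld)                            # oldest mysqld process
sudo ls /proc/"$pid"/fd | wc -l                   # total open file descriptors
sudo ls -la /proc/"$pid"/fd | grep -c temptable   # how many are temp tables
```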
If you see thousands of open file descriptors that look like this (taken from this SO question):
lrwx------ 1 mysql mysql 64 Apr 17 08:56 990 -> /tmp/mysql_temptable.xTHQV4 (deleted)
lrwx------ 1 mysql mysql 64 Apr 17 08:56 991 -> /tmp/mysql_temptable.gr1swq (deleted)
lrwx------ 1 mysql mysql 64 Apr 17 08:56 992 -> /tmp/mysql_temptable.sXackV (deleted)
lrwx------ 1 mysql mysql 64 Apr 17 08:56 993 -> /tmp/mysql_temptable.Tom8Pa (deleted)
lrwx------ 1 mysql mysql 64 Apr 17 08:56 994 -> /tmp/mysql_temptable.OqNhMl (deleted)
lrwx------ 1 mysql mysql 64 Apr 17 08:56 995 -> /tmp/mysql_temptable.VOlk8X (deleted)
lrwx------ 1 mysql mysql 64 Apr 17 08:56 996 -> /tmp/mysql_temptable.ti1nry (deleted)
lrwx------ 1 mysql mysql 64 Apr 17 08:56 997 -> /tmp/mysql_temptable.EeXTiS (deleted)
lrwx------ 1 mysql mysql 64 Apr 17 08:56 998 -> /tmp/mysql_temptable.r2GHks (deleted)
lrwx------ 1 mysql mysql 64 Apr 17 08:56 999 -> /tmp/mysql_temptable.NDCeta (deleted)
Then you’re hitting this bug.
One solution is to upgrade MySQL server to a more recent version. I was running MySQL 8.0.15, and upgrading to 8.0.20 fixed the problem.
Nginx
I could probably write an entire page of TILs about Nginx, but I started writing this log late, so there are only a few TILs in this section.
`sendfile` is `off` by default
`open_file_cache` is `off` by default
Like most folks hearing that Nginx is (was?) the fastest web server, I assumed several things about its performance:
- serving static files was the *fastest* path, by default
- there is nothing faster than serving a static file [1]
- files are cached in-memory, by default
It turns out all three of those assumptions are wrong.
Despite its name, `open_file_cache` does not actually cache the contents of files; instead, it caches metadata about files (open file descriptors, stat sizes & modification times, directory existence, etc.).
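For reference, turning both on looks something like this (the values are illustrative, not tuned recommendations):

```nginx
http {
    sendfile on;

    # caches file descriptors / stat metadata, not file contents
    open_file_cache          max=10000 inactive=30s;
    open_file_cache_valid    60s;
    open_file_cache_min_uses 2;
    open_file_cache_errors   on;
}
```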
[1] excluding something slightly more exotic, like a `location` configured to immediately emit a string from memory.
OpenResty
Rendering a file via `lua-resty-template` is 3x faster than serving the same file statically via Nginx w/ no Lua code involved (at least for some as-yet-unpublished Nginx configuration)
When I first saw this, I was quite surprised!
As described in the `lua-resty-template` benchmark notes:
Others have reported that in simple benchmarks running this template engine actually beats Nginx serving static files by a factor of three. So I guess this engine is quite fast.
Uh…that is entirely the opposite of the assumption I’ve been operating under: serving static files is possibly an order of magnitude faster than serving content via Lua code (and rendering a template via Lua code is even slower than that).
In the linked issue, Bungle (the package’s author) himself was surprised by this reported result:
@Pronan, it feels a bit strange for template+routing to be faster than serving a static file. But I’m not complaining. You have not presented your configs and code, so it is hard for me to judge anything. Have you used open_file_cache on Nginx? Using cached lua-resty-template you are not using any file io, and that can explain something. You could put the file on memory backed tmpfs also.
Some possible explanations for why Lua code would be faster than serving a static file:
- static file serving hits disk, while Lua code is cached in-memory with no disk I/O (`lua_code_cache`)
- a wild-ass guess: static file serving is reading file metadata, e.g., to determine MIME type (no evidence of this, just a possibility)
OpenResty configuration is parallel and separate from Nginx configuration
You might think that OpenResty configuration lives in the same place as Nginx's configuration (e.g., `/etc/nginx/` on Ubuntu).
…because OpenResty is based on Nginx.
But that is not the case. OpenResty stores its configuration files, Lua packages, etc. in a separate folder, e.g., `/usr/local/openresty/` on Ubuntu.
So OpenResty does not read any of Nginx's configuration files (from `/etc/nginx/`), and vice versa.
`lua-resty-mysql` gotcha: `NULL` MySQL values are represented as `ngx.null`, not `nil` - unintuitively (and `nil ~= ngx.null`)
The `lua-resty-mysql` library represents `NULL` values from MySQL as `ngx.null`, and not as `nil`.
It’s an unintuitive but deliberate design decision.
If you didn't know this, one breadcrumb that might lead you to figure it out is the type of the `NULL` column-value:
> print(type(rows[1].some_column_whose_value_is_null))
'userdata'
If you expected to see `nil`, then this is a clue that may help lead you to enlightenment.
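In practice I normalize `ngx.null` to `nil` right after the query. A rough sketch, assuming `db` is an already-connected `lua-resty-mysql` object and the table/column names are made up:

```lua
local rows, err = db:query("SELECT name FROM users WHERE id = 1")
if not rows then
  ngx.log(ngx.ERR, "query failed: ", err)
  return
end

local name = rows[1] and rows[1].name
if name == ngx.null then  -- note: NOT `name == nil`
  name = nil
end
```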
Be careful with internal redirects
The important variables are:
- `ngx.var.uri`
- `ngx.var.request_uri`
- `ngx.var.request_method`
`ngx.var.uri` and `ngx.var.request_uri` are different for internal redirects.
If you're using a Lua-based router to dispatch requested URIs to Lua methods, you're probably switching on `ngx.var.request_method`.
That's fine for normal usage, but has a problem when using internal redirects: the request method does not change (and cannot be changed?) when calling `ngx.exec()`.
That's a problem when we want to `ngx.exec()` from a POST handler to invoke a GET handler.
The router switches on `ngx.var.request_method`, but that will still be "POST" after calling `ngx.exec()`.
To see why this is, consider that of the Nginx variables in the list above, `ngx.exec()` only affects one: `ngx.var.uri`.
All the other variables remain unchanged.
As far as I can tell, there’s no way to change the request method for an internal redirect. A workaround is to make your routes more permissive, e.g., make a GET-only route accept POSTs too. It’s not clean, I know, but it works.
Another option is to implement “internal internal redirection” (^_^) in your router, and just call the handler yourself.
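A contrived sketch of the permissive-route workaround (the routes and handlers are made up):

```lua
local routes = {
  ["/submit"] = {
    POST = function()
      -- ...handle the form, then internally redirect:
      return ngx.exec("/thanks")
    end,
  },
  ["/thanks"] = {
    GET  = function() ngx.say("thanks!") end,
    POST = function() ngx.say("thanks!") end,  -- still "POST" after ngx.exec()
  },
}

local route = routes[ngx.var.uri]
local handler = route and route[ngx.var.request_method]
if handler then
  handler()
else
  ngx.exit(ngx.HTTP_NOT_FOUND)
end
```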
Sublime Text
Custom Build System
I was running the `luacheck` linter and had 100s of warnings to handle.
My monitor is small, so I was alt-tabbing between the console and Sublime Text to edit files. Even if my monitor were large, jumping to the correct file & line number 100s of times isn’t very appealing.
So I searched and discovered that Sublime Text lets you setup a custom ‘build system’ to act like a full-fledged IDE.
When it's configured correctly, this lets you double-click tool output (in this case, generated by `luacheck`) and jump straight to the file, line number, and even the column where the error occurred, with error messages displayed inline in the source code.
Wonderful!
I added a build system to my ST project file; the custom `file_regex` for `luacheck` (on Linux) looks like this:
"file_regex": "^[ ]*([^:]*)[:]([^:]*)[:]([^:]*)[:](.*)$",
Note that `luacheck` may output relative paths, in which case you should also specify the `"working_dir"`.
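For context, the whole build-system entry in my `.sublime-project` looks roughly like this (the command and paths are placeholders for whatever your project uses):

```json
{
    "build_systems": [
        {
            "name": "luacheck",
            "shell_cmd": "luacheck src/",
            "working_dir": "${project_path}",
            "file_regex": "^[ ]*([^:]*)[:]([^:]*)[:]([^:]*)[:](.*)$"
        }
    ]
}
```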
Also note that debugging a faulty build system configuration is a real pain.
Small misconfigurations or regex errors will result in strange behavior, like double-clicking a result line opening an empty file (with the correct filename in the tab) with no contents, but ‘inline’ errors visible.
In my case, I initially fudged the leading-whitespace syntax in my `file_regex`.
I 'debugged' this by hovering my mouse over the opened file's tab and noticing (incorrect) additional whitespace in the file's path.
I love Sublime Text, and quite honestly this is one of my favorite things about it. It's an example of Sublime's 'minimal but functional and flexible' ethos.
Command Line
Highlight `stderr` lines in red (June 12, 2020)
Recently I needed to see errors in a dump from a very verbose Python script.
While there are magic incantations to do this the hard way, I wanted a tool I could just prepend to my command line and see `stderr` lines highlighted in red.
highlight-stderr is that tool, and it’s brilliant – simple, tiny, useful, and written in Rust.
You can install `highlight-stderr` thusly:
cargo install highlight-stderr
And then run the command whose errors you want to highlight:
highlight-stderr bash -c 'ls none; echo stdout; this-command-does-not-exist'
Bash
Saving separate history files (per terminal / tab)
I run with a lot of open terminal tabs & tmux windows.
But Bash overwrites `~/.bash_history` when any terminal exits (by default), which makes it impossible to retrace my steps in different projects.
There are several automated solutions, but really all I needed was to apply some hygiene before `exit`ing each terminal:
history > .bash_history.NNN
Where `NNN` is an increasing number, padded with zeros (so the history files sort correctly).
I run that command in the root of each project (when I remember to!), and it keeps a log of commands unique to that project.
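If you'd rather not pick `NNN` by hand, a rough sketch that derives the next zero-padded suffix (it assumes the existing files follow the `.bash_history.NNN` pattern):

```bash
next=$(printf '%03d' $(( $(ls .bash_history.* 2>/dev/null | wc -l) + 1 )))
history > ".bash_history.$next"
```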
It's OK that it's messy - I use `fzf`, `grep`, etc. to filter out what I really need.
And I started saving a `.bash_greatest_hits` file in each project, to remind myself of the most important / frequently-run commands.
For example: `source ./venv/bin/activate && scripts/start-server.sh` usually goes at the top of that file, along with build & deployment commands.
The combination of (manually-generated) `.bash_history.NNN` files and a (manually-curated) `.bash_greatest_hits` file lets me keep track of all the commands I've run on a project, and promote the most useful ones for visibility.
It’s a simple but quite useful habit.
Start multiple processes in the background, wait for them all to finish, and kill them all on CTRL+C
At the top of your script, register a `trap` to capture CTRL+C:
trap "killall" INT
killall() {
  trap '' INT TERM    # ignore INT and TERM while shutting down
  echo "**** Shutting down... ****"
  kill -TERM 0        # fixed order, send TERM not INT
  wait
  echo DONE
}
The `killall()` function calls `kill -TERM 0`, which kills the child processes of the calling script – i.e., the background processes the script spawns.
At the end of your script, simply `wait` for all the background processes to finish:
wait
The script will run until all the background processes finish, or you press CTRL+C, whichever comes first.
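Putting it all together, a minimal sketch (the two worker commands are placeholders):

```bash
#!/usr/bin/env bash

trap "killall" INT
killall() {
  trap '' INT TERM
  echo "**** Shutting down... ****"
  kill -TERM 0
  wait
  echo DONE
}

long_running_worker_a &   # e.g., a dev server
long_running_worker_b &   # e.g., a file watcher

wait   # blocks until both finish, or until CTRL+C triggers killall()
```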
Functions can run in the background too (!?)
You can invoke a Bash function with `&` just like a process (!?).
I've used this trick to do the `inotifywait`-`make`-while-loop trick in a function, and run that function in parallel, alongside other build processes (also wrapped in functions which are invoked with `&`).
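Something like this (the watched path and build command are made up):

```bash
watch_and_build() {
  while inotifywait -e modify -r src/; do
    make
  done
}

watch_and_build &    # a function takes `&` just like an external command
other_build_step &   # another (hypothetical) background job
wait
```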
ps
Get extended info for processes in current TTY
By default, `ps` does not show the extended command line for each process – which makes it useless for figuring out which of a handful of `python3` processes is the one you want to halt.
Habitually, many of us just reach for `ps aux` and `grep` to filter, but there's a better way: `ps -t` shows extended info for only those processes running in the current terminal/TTY.
GitLab
Running the GitLab container registry on an HTTP GitLab instance, and successfully logging into it
(This tip assumes you installed GitLab via the Omnibus installation method.)
Running GitLab over HTTP is not recommended. Same goes for running a Docker registry over HTTP. As such, this should be a rare (and dangerous!) combo. I only use this for quick testing on ephemeral instances when I don’t have time to fuss with TLS settings on local machines that aren’t publicly exposed.
First, enable the GitLab container registry by editing `/etc/gitlab/gitlab.rb`.
The bits below are the only changes I made.
Replace `YOUR-GITLAB-HOST` with…your GitLab's hostname.
...
################################################################################
## Container Registry settings
##! Docs: https://docs.gitlab.com/ce/administration/container_registry.html
################################################################################
registry_external_url 'http://YOUR-GITLAB-HOST:5001/'
### Settings used by GitLab application
gitlab_rails['registry_enabled'] = true
gitlab_rails['registry_host'] = "registry.YOUR-GITLAB-HOST"
gitlab_rails['registry_port'] = "5005"
gitlab_rails['registry_path'] = "/var/opt/gitlab/gitlab-rails/shared/registry"
...
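If I remember right, Omnibus only picks up `gitlab.rb` changes after a reconfigure:

```
sudo gitlab-ctl reconfigure
```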
On the machine you'll be logging in from, edit `/etc/docker/daemon.json` (or the equivalent for your platform) to mark the GitLab container registry as insecure:
{
"insecure-registries" : ["YOUR-GITLAB-HOST:5001"]
}
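Docker also needs a restart to pick up `daemon.json` changes - on systemd-based distros that's something like:

```
sudo systemctl restart docker
```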
Then login to the GitLab container registry that you just configured:
docker login http://YOUR-GITLAB-HOST:5001
If all went well, you should see:
Login Succeeded
If login did not succeed, you should try accessing the container registry via your browser, e.g., when you navigate to `http://YOUR-GITLAB-HOST:5001/`, it should show a blank page (and an HTTP `200` response status).
If you see an Nginx error, or an HTTP `4XX` response code, then something's misconfigured.
Try different settings in your `/etc/gitlab/gitlab.rb` - e.g., change port numbers, etc.
You could also do a sanity test to see if `nginx` is indeed running at the expected port (`5001` for the configuration above) by running:
sudo netstat -nlp | grep -i 5001
When all is working, you should see something like:
tcp 0 0 0.0.0.0:5001 0.0.0.0:* LISTEN #####/nginx
GitLab CI/CD
Skip a CI/CD build with a `[skip ci]` commit message
You can tell GitLab CI/CD (and several other CI/CD tools) to skip a CI build with a special commit message. This feature is useful when you want to commit code but not trigger the CI pipeline:
- Update `README.md`
- Update (internal-only) docs
- Commit internal repo book-keeping changes (bump version number, etc.)
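A commit like this (the message itself is hypothetical) will skip the pipeline:

```
git commit -m "Update README.md [skip ci]"
```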
Several `go-semrel-gitlab` commands use a `[skip ci]` commit message by default.
That way, you can bump the project version without invoking a CI pipeline build.
There are drawbacks with `[skip ci]`, though.
Notes on go-semrel-gitlab's `.gitlab-ci.yml` example
That CI/CD pipeline example is useful for understanding how to apply `go-semrel-gitlab` in your own project.
But it's a bit complex to decipher what exactly the `release` commands are doing in there, and how they interact with the other bits.
Here’s an outline of the release automation parts of that pipeline:
- `version` stage: `release next-version` computes the next version number, and stores it in the file `.next-version`
- `build` stage: compiles the Go package, which contains the contents of `.next-version` (among other things)
- `image` stage: (no `go-semrel-gitlab` commands are executed in this stage)
- `release` stage: this stage is split into several jobs, all of which are manually triggered
  - only runs on branches:
    - pre-release-image: (no `go-semrel-gitlab` commands executed here)
    - pre-release: `release commit-and-tag`, `release add-download-link`
  - only on `master`:
    - release: `release changelog`, `release commit-and-tag`, `release add-download-link`
    - pages: (no `go-semrel-gitlab` commands are executed here)
The `release` stage does all the (external) mutations: commit / push to Git, add "release assets" URL to the release, etc.
Storycap
Storycap can capture (some) built & hosted websites
Instead of `git clone`-ing and setting up Storybook from repos, you can save time by capturing an already-live Storybook deployment.
For example, you can screenshot the components in the Vue kitchen sink Storybook (give it a minute, it’s sometimes slow to load):
npx storycap https://next--storybookjs.netlify.app/vue-kitchen-sink/ -o __screenshots__foo
React
Higher-Order Components and hooks
If you try to write an HOC like this (totally contrived example):
import { useEffect } from 'react';

const withSomething = aBoolProp => {
  useEffect(() => {
    console.log(aBoolProp);
  }, [aBoolProp]);

  return <div>{aBoolProp}</div>;
};

export default withSomething;
Then you'll get an error ??? about it being invalid to use a hook here (TODO: specific error message).
The problem is that you've indicated you're writing an HOC with the `with` prefix, but ???
One solution is to refactor this slightly to get a proper hooks-compatible HOC:
import { useEffect } from 'react';

const withSomething = (Comp, aBoolProp) => {
  return props => {
    useEffect(() => {
      console.log(aBoolProp);
    }, [aBoolProp]);

    return (
      <div>
        {/* component parameters must be capitalized (Comp, not comp) for JSX */}
        <Comp {...props} />
      </div>
    );
  };
};

export default withSomething;
Uber react-vis data visualization component library for React
`react-vis` was marked deprecated Jan 2020, then un-deprecated May 2020
Before considering using `react-vis`, you should read this entire GitHub issue thread to get a sense of the project's long-term status (still TBD) and some great assessments of `react-vis`'s limitations given the state of the art in 2020.
Andrew McNutt had this to say (emphasis mine):
There’s an increasing trend in visualization libraries to shift away from imperative visualization declaration (e.g. put this circle here) to declarative declarations (gimmie an x/y plot with circles). While react-vis has always been a little towards the latter it doesn’t feature the rich grammar of graphics style declarations that libraries like ggplot or vega-lite/altair feature. It’s hard to see it go, but it’s also very much a visualization library of a particular era (specifically the react-ify-everything-stylings of 2015-6). I definitely still find myself using react-vis a reasonable amount, but more and more I also find myself using vega-embed/react-vega and the like.
Another limitation of `react-vis` is its retro approach to styling:
One main sticking point for deprecating this library is that we don’t have a sound and modern strategy for styling in today’s React ecosystem.
There's a desire to update `react-vis` to use the more modern, React-friendly Styletron instead of the current old-school styling approach:
react-vis is uses scss which is no longer a best practice nor is it tightly compatible with these new libraries/frameworks which are widely used at Uber.
Apparently Uber devs are increasingly using other visualization libraries instead of `react-vis`: ECharts, Nivo, and Highcharts, among others.
How to build react-vis
As of June 2020, I couldn't get the `master` branch to build, despite several attempts.
Mostly I ran into issues with `yarn install` or the various `run` scripts failing.
I was using Node v14 and ran into waves of errors.
After installing Node v11 (via `nvm`), I could build `react-vis` successfully.
I used Philip Peterson’s fork at commit 699ff938807600924878ef3e2a79d98c45ca51a9.
Build error workarounds
When you `yarn install`, you may encounter problems fetching or building dependencies.
I found the following issues on Ubuntu 18.04 with Node 14.2.0.
If you see any errors about `canvas-prebuilt`, you can get past them via Philip Peterson's `use-node-canvas-stock` branch from his forked repo.
That branch replaces the `canvas-prebuilt` dependency with plain-old `canvas`.
If you see an error like this:
Package cairo was not found in the pkg-config search path.
Perhaps you should add the directory containing `cairo.pc'
to the PKG_CONFIG_PATH environment variable
No package 'cairo' found
gyp: Call to 'pkg-config cairo --libs' returned exit status 1 while in binding.gyp. while trying to load binding.gyp
Then you need to install the `cairo` development packages:
sudo apt-get install libcairo2-dev libjpeg-dev libgif-dev
If you see low-level V8 / C++ compiler errors, then you may need to downgrade Node to v11 (yes, v11).
Assuming you've already installed `nvm`:
nvm install 11
nvm use 11
Running the `react-vis` Storybook
(This tip assumes you're using Node v11 and Philip Peterson's `react-vis` fork at commit 699ff938807600924878ef3e2a79d98c45ca51a9.)
First, build `react-vis` (using Node v11):
npm install
npm run build
Then build & run the `react-vis` Storybook:
cd website
npm install
npm run build-storybook
npm run storybook
Finally, visit http://localhost:9001 to see the `react-vis` Storybook.
Rendering `react-vis` Storybook components to images with Storycap
First, install Storycap:
npm install storycap
Then, run the `react-vis` Storybook server (as described in the previous tip).
Finally, run `storycap` to screenshot the components from Storybook:
npx storycap
Alternatively, if you don't want to build `react-vis` locally, you can run `storycap` on a public hosted instance of the `react-vis` Storybook:
npx storycap https://uber.github.io/react-vis/website/dist/storybook
Don't forget to include the CSS to render `react-vis` visualizations properly
If your react-vis visualizations aren't rendering properly, you might not have included the main `react-vis` stylesheet.
Symptoms of improper rendering include any of the following:
- graphs that don't line up with axes
- missing grid lines (if applied to your visualization)
- `<Crosshair>`s that don't render (at all)
- solid black filled line graphs
Specific rendering artifacts will depend on what features you’re using.
You can import the `react-vis` stylesheet via JavaScript:
import '../node_modules/react-vis/dist/style.css';
Or import it via SASS instead:
@import "~react-vis/dist/style";
For example, the react-vis simple chart on Codepen references the external react-vis.css stylesheet.
`<Crosshair>` default rendering behavior, overriding defaults
For both Canvas & SVG modes, `<Crosshair>` renders as an absolutely-positioned `<div>` that is overlaid on top of the visualization.
That `<div>` contains child `<div>`s that (a) display a vertical line at the current x coordinate, and (b) display the current values (unless you override the default content).
You can override the default 'tooltip' contents by providing your own nodes inside the `<Crosshair>` component in your JSX code.
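A rough sketch of what that looks like (the data, state handling, and styling are all made up):

```jsx
import React, { useState } from 'react';
import { XYPlot, LineSeries, Crosshair } from 'react-vis';
import 'react-vis/dist/style.css';

const data = [{ x: 1, y: 10 }, { x: 2, y: 5 }, { x: 3, y: 15 }];

export default function ChartWithCustomCrosshair() {
  const [values, setValues] = useState([]);
  return (
    <XYPlot width={300} height={200} onMouseLeave={() => setValues([])}>
      <LineSeries data={data} onNearestX={value => setValues([value])} />
      {/* children of <Crosshair> replace the default tooltip contents */}
      <Crosshair values={values}>
        <div style={{ background: '#222', color: '#fff', padding: '4px' }}>
          {values[0] && `x: ${values[0].x}, y: ${values[0].y}`}
        </div>
      </Crosshair>
    </XYPlot>
  );
}
```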
Fonts
Some nice modern fonts for the web
The set of widely-agreed-upon “good fonts” is ever-changing, a bit like fashion.
Here’s some I came across in the late 2010s that I like:
- Muli is a lovely open source font used by Cypress’s integration testing app frontend.
- Proxima Nova is a nice one for commercial work. It has an extended version with monospaced numbers for displaying numerical tables, etc.
Kibana
Exporting log data from Kibana
TODO screenshots, links to instructions
- Name your query
- Click “share”, then “generate CSV”
Sometimes this won’t succeed (the button might be disabled), but if you keep trying it should work.
Combo TILs
The TILs in this section combine multiple tools in one recipe.
Find & edit files, one at a time
I needed to make small changes to a dozen files in many directories with a common pattern to their filename.
But you can't simply do a `find <args> | xargs nano` – nano exits immediately with SIGHUP / SIGTERM.
You can simply change the syntax, and do: `nano $(find <args>)`.
Or you can use find's `-exec` syntax: `find <args> -exec nano {} \;`
See: Passing file directly to nano fails with sighup or sigterm
Viewing & filtering Parquet files with `jq` and `parquet-tools`
To pretty-print the JSON:
parquet-tools cat --json file.parquet | jq
To filter the JSON (for example):
parquet-tools cat --json file.parquet | jq ".field"
Getting data into Mode via S3 and Redshift
Assuming your Mode app is already connected to a Redshift database:
- Convert your data to CSV (e.g., from JSON)
- Upload your CSV file to S3 (via command line or S3 console)
- Create your table, and run a query in Redshift to `COPY` the CSV file from S3, passing the right `iam_role` (see the sketch below)
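A rough sketch of the `COPY` step (the table name, bucket path, and role ARN are all made up):

```sql
COPY my_table
FROM 's3://my-bucket/exports/data.csv'
IAM_ROLE 'arn:aws:iam::123456789012:role/my-redshift-role'
CSV
IGNOREHEADER 1;
```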
When the data is in a Redshift table, you can query it from Mode.