
A Regex Saved Me Enough Time To Write A Blog Post About It

Tom Steavenson - Engineering Manager

So today I’ve been adding OpenAPI spec linting to our CI pipeline and fixing any errors I found... what could possibly go wrong? 😊

After shopping around (there are a few linters out there), I settled on Spectral: it reports more than one error at a time, gives line numbers, and lets you endlessly tweak the linting config if there are any “special exemptions” you need.

So I had a go at running it locally:

[Image: Spectral output from the local run]

Whoops... 😬

OK, so it turns out that 300 or so of these errors are instances of the same issue.

2781:9  error  parser  Mapping key must be a string scalar rather than number  paths./api/v2-beta/model-version-instance/{instance_id}.delete.responses[200]

In English: a YAML type error. YAML reads an unquoted 200 as a number, and the linter wants the key to be a string. Where we have something like this:

responses:
  200:
    description: OK

We want it to be this instead:

responses:
  "200":
    description: OK
    ...

So the key is read as a string.

This still leaves me with one issue. I need to go and make this change 300 times... 🤔

Not to fear, sed is here.

sed -i 's/\([0-9]\{3,3\}\):/"\1":/g' api-spec-v2.tmpl.yml

What?

In English:

  • Make the changes in place (sed -i) to the file api-spec-v2.tmpl.yml
  • Wherever you find a 3 digit number followed by a colon, put quotes around that number.
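To see the effect without touching the real file, the same substitution can be run on a few sample lines piped through sed. This is only a minimal sketch: the sample YAML and the helper name quote_status_keys are made up for illustration, and only the sed expression itself comes from the command above.

```shell
# Hypothetical helper: applies the substitution from the post to stdin
# and writes the result to stdout. There is no -i here, so nothing on
# disk is modified.
quote_status_keys() {
  sed 's/\([0-9]\{3,3\}\):/"\1":/g'
}

# Made-up sample input that resembles the responses section of the spec.
printf 'responses:\n  200:\n    description: OK\n  404:\n    description: Not Found\n' \
  | quote_status_keys
```

Both 200: and 404: come out quoted, and the description lines pass through untouched.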

But how?

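The leading s/ tells sed to perform a substitution, and the final /g tells sed to replace every match on a line rather than just the first one. sed already works through every line of the file, so together that means “do it everywhere”.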

The sed expression is of the form s/<matching regex>/<substitution>/g.
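The /g flag only matters when a single line contains more than one match. A quick way to see the difference, sketched here with a made-up one-line input (the variable names are just for illustration):

```shell
# Made-up line with two candidate matches on it.
line='200: ok, 404: missing'

# Without /g, only the first match on the line is replaced.
first_only=$(printf '%s\n' "$line" | sed 's/\([0-9]\{3,3\}\):/"\1":/')

# With /g, every match on the line is replaced.
every_match=$(printf '%s\n' "$line" | sed 's/\([0-9]\{3,3\}\):/"\1":/g')

printf '%s\n' "$first_only" "$every_match"
```

In the spec each status code sits on its own line, so /g probably made no difference there, but it is a cheap safety net.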

Looking at the matching regex:

[0-9] matches a single digit (any character from 0 to 9). \{3,3\} says to match that a minimum of 3 and a maximum of 3 times, i.e. exactly 3 digits (\{3\} is a shorter way to write the same thing). The \( and \) surrounding this tell sed that the match inside is a “group” (I’ll come back to that).

The : that follows says that the 3 digits must be followed by a colon.

Looking at the substitution:

We’ve got the quotes and the colon, and in between the quotes, \1 says to copy in the first match group (the thing surrounded by \( and \)).
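To make the group easier to spot, here is a small sketch that keeps the same matching regex but swaps in a different, made-up replacement (code=\1), so whatever lands after the equals sign is exactly what the group captured. The helper name show_group is hypothetical.

```shell
# Same matching regex as in the post; the replacement is changed purely
# to make the captured group visible.
show_group() {
  sed 's/\([0-9]\{3,3\}\):/code=\1/'
}

# Three digits plus a colon: matches, and \1 holds just the digits.
printf '200:\n' | show_group
# Only two digits: no match, so the line passes through unchanged.
printf '42:\n' | show_group
# Three digits but no colon: no match either.
printf '200\n' | show_group
```

Only the first line is rewritten (to code=200). The colon is part of the match but sits outside the group, which is why the real substitution has to put it back by hand.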

Of course, this matching regex isn’t all that robust... Something like response_was_200: true would get turned into response_was_"200": true, which is no good. Before developing the regex any further, we can check what it actually matches in the file.

grep '\([0-9]\{3,3\}\):' api-spec-v2.tmpl.yml

The output was far too verbose to paste in here, but take it from me, the only matches were ones that we cared to change!
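Had the grep turned up something like the response_was_200 case, one possible next step would be to anchor the pattern to the start of the line. The sketch below is not what I ran; it assumes a real status-code key is preceded only by indentation, and it uses a made-up three-line sample file. Counting matching lines with grep -c is also a handy way to compare patterns when the full output is too long to read.

```shell
# Made-up sample: two real status-code keys and one false positive.
sample=$(mktemp)
printf '  200:\n  404:\nresponse_was_200: true\n' > "$sample"

# The pattern from the post.
loose='\([0-9]\{3,3\}\):'
# Assumption: a real key has nothing but whitespace in front of it.
strict='^[[:space:]]*[0-9]\{3\}:'

# grep -c prints the number of matching lines.
loose_count=$(grep -c "$loose" "$sample")
strict_count=$(grep -c "$strict" "$sample")
echo "loose: $loose_count, strict: $strict_count"

# The corresponding substitution: group 1 keeps the indentation,
# group 2 holds the three digits.
fixed=$(sed 's/^\([[:space:]]*\)\([0-9]\{3\}\):/\1"\2":/' "$sample")
printf '%s\n' "$fixed"

rm -f "$sample"
```

On this sample the loose pattern matches all three lines while the anchored one matches only the two real keys, and response_was_200: true is left alone. A side effect of the anchoring is that running the stricter substitution a second time changes nothing, whereas the original would wrap already-quoted keys in another pair of quotes.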

So there we go! 300 errors fixed in a few seconds, and... A Regex Saved Me Enough Time To Write A Blog Post About It.