fix incorrect ToC heading levels

John Bowdre 2021-12-09 19:30:35 -06:00
parent be754e7cb1
commit 74e5e2ced9
2 changed files with 16 additions and 16 deletions


@ -10,38 +10,38 @@ title: Finding the most popular IPs in a log file
I found myself with a sudden need for parsing a Linux server's logs to figure out which host(s) had been slamming it with an unexpected burst of traffic. Sure, there are proper log analysis tools out there which would undoubtedly make short work of this but none of those were installed on this hardened system. So this is what I came up with.
#### Find IP-ish strings
### Find IP-ish strings
This will get you all occurrences of things which look vaguely like IPv4 addresses:
```shell
grep -o -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' ACCESS_LOG.TXT
```
(It's not a perfect IP address regex since it would match things like `987.654.321.555` but it's close enough for my needs.)
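If those false positives ever matter, one option is to post-filter the matches and keep only strings whose four octets each fall in the 0-255 range. This is a minimal sketch that assumes `awk` is available on the system; the `valid_ipv4` name is made up for illustration:

```shell
# Hypothetical helper (not part of the original post): read candidate
# addresses on stdin, one per line, and print only those with exactly
# four dot-separated fields that are each no larger than 255.
valid_ipv4() {
  awk -F. 'NF == 4 && $1 <= 255 && $2 <= 255 && $3 <= 255 && $4 <= 255'
}

printf '%s\n' 10.0.0.1 987.654.321.555 | valid_ipv4
# prints 10.0.0.1
```

The `grep -o -E` output could be piped straight into `valid_ipv4` ahead of the later steps.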
#### Filter out `localhost`
### Filter out `localhost`
The log likely includes a LOT of traffic to/from `127.0.0.1` so let's toss out `localhost` by piping through `grep -v "127.0.0.1"` (`-v` will do an inverse match - only return results which *don't* match the given expression):
```shell
grep -o -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' ACCESS_LOG.TXT | grep -v "127.0.0.1"
```
#### Count up the duplicates
### Count up the duplicates
Now we need to know how many times each IP shows up in the log. `uniq -c` collapses repeated lines and prefixes each one with a count of how many times it appeared, but it only compares *adjacent* lines, so the output needs to go through `sort` first to group identical addresses together:
```shell
grep -o -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' ACCESS_LOG.TXT | grep -v "127.0.0.1" | sort | uniq -c
```
#### Sort the results
### Sort the results
We can use `sort` to sort the results. `-n` tells it to sort based on numeric rather than character values, and `-r` reverses the list so that the larger numbers appear at the top:
```shell
grep -o -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' ACCESS_LOG.TXT | grep -v "127.0.0.1" | sort | uniq -c | sort -n -r
```
#### Top 5
### Top 5
And, finally, let's use `head -n 5` to only get the first five results:
```shell
grep -o -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' ACCESS_LOG.TXT | grep -v "127.0.0.1" | sort | uniq -c | sort -n -r | head -n 5
```
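For repeated use, the whole pipeline could be wrapped in a small shell function. This is only a sketch: the `top_ips` name and its optional count argument are invented here. It differs from the one-liner in two ways. It sorts before `uniq -c`, since `uniq` only collapses adjacent lines, and it uses `grep -v -x -F` so that only lines which are exactly `127.0.0.1` get dropped (with a plain pattern the dots match any character, and an address like `127.0.0.10` would be discarded too).

```shell
# Hypothetical wrapper around the pipeline above.
# Usage: top_ips LOGFILE [COUNT]   (COUNT defaults to 5)
top_ips() {
  grep -o -E '[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}' "$1" |
    grep -v -x -F '127.0.0.1' |   # drop exact localhost matches only
    sort |                        # group identical addresses together
    uniq -c |                     # count each group
    sort -n -r |                  # busiest addresses first
    head -n "${2:-5}"
}
```

Calling `top_ips ACCESS_LOG.TXT 10` would then list the ten busiest addresses.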
#### Bonus round!
### Bonus round!
You know how old log files get rotated and compressed into files like `logname.1.gz`? I *very* recently learned that there are versions of the standard Linux text manipulation tools which can work directly on compressed log files, without having to first extract the files. I'd been doing things the hard way for years - no longer, now that I know about `zcat`, `zdiff`, `zgrep`, and `zless`!
So let's use a `for` loop to iterate through 20 of those compressed logs, and use `date -r [filename]` to get the timestamp for each log as we go:


@ -7,22 +7,22 @@ tags:
- serverless
title: Free serverless URL shortener on Google Cloud Run
---
#### Intro
### Intro
I've been [using short.io with a custom domain](https://twitter.com/johndotbowdre/status/1370125198196887556) to keep track of and share messy links for a few months now. That approach has worked very well, but it's also seriously overkill for my needs. I don't need (nor want) tracking metrics to know anything about when those links get clicked, and short.io doesn't provide an easy way to turn that off. I was casually looking for a lighter self-hosted alternative today when I stumbled upon a *serverless* alternative: **[sheets-url-shortener](https://github.com/ahmetb/sheets-url-shortener)**. This uses [Google Cloud Run](https://cloud.google.com/run/) to run an ultralight application container which receives an incoming web request, looks for the path in a Google Sheet, and redirects the client to the appropriate URL. It supports connecting with a custom domain, and should run happily within the [Cloud Run Free Tier limits](https://cloud.google.com/run/pricing).
The GitHub instructions were pretty straightforward, but I did have to fumble through a few additional steps to get everything up and running. Here we go:
#### Shortcut mapping
### Shortcut mapping
Since the setup uses a simple Google Sheets document to map the shortcuts to the original long-form URLs, I started by going to [https://sheets.new](https://sheets.new) to create a new Sheet. I then just copied in the shortcuts and URLs I was already using in short.io. By the way, I learned on a previous attempt that this solution only works with lowercase shortcuts so I made sure to convert my `MixedCase` ones as I went.
![Creating a new sheet](/images/posts-2021/08/20210820_sheet.png)
I then made a note of the Sheet ID from the URL; that's the bit that looks like `1SMeoyesCaGHRlYdGj9VyqD-qhXtab1jrcgHZ0irvNDs`. That will be needed later on.
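If picking the ID out of the address bar by eye feels error-prone, it can also be pulled out of a pasted URL in the shell. This is a rough sketch that assumes the usual `https://docs.google.com/spreadsheets/d/<ID>/edit...` URL shape; the `sheet_id` function name and the full example URL are illustrative only:

```shell
# Hypothetical helper: print the Sheet ID from a Google Sheets URL,
# i.e. the path segment that follows /spreadsheets/d/.
sheet_id() {
  printf '%s\n' "$1" | sed -n -E 's#^.*/spreadsheets/d/([A-Za-z0-9_-]+).*$#\1#p'
}

# Example URL assembled from the ID shown above (assumed URL shape).
sheet_id 'https://docs.google.com/spreadsheets/d/1SMeoyesCaGHRlYdGj9VyqD-qhXtab1jrcgHZ0irvNDs/edit#gid=0'
# prints 1SMeoyesCaGHRlYdGj9VyqD-qhXtab1jrcgHZ0irvNDs
```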
#### Create a new GCP project
### Create a new GCP project
I created a new project in my GCP account by going to [https://console.cloud.google.com/projectcreate](https://console.cloud.google.com/projectcreate) and entering a descriptive name.
![Creating a new GCP project](/images/posts-2021/08/20210820_create_project.png)
#### Deploy to GCP
### Deploy to GCP
At this point, I was ready to actually kick off the deployment. Ahmet made this part exceptionally easy: just hit the **Run on Google Cloud** button from the [GitHub project page](https://github.com/ahmetb/sheets-url-shortener#setup). That opens up a Google Cloud Shell instance which prompts for authorization before it starts the deployment script.
![Open in Cloud Shell prompt](/images/posts-2021/08/20210820_open_in_cloud_shell.png)
@ -31,14 +31,14 @@ At this point, I was ready to actually kick off the deployment. Ahmet made this
The script prompted me to select a project and a region, and then asked for the Sheet ID that I copied earlier.
![Cloud Shell deployment](/images/posts-2021/08/20210820_cloud_shell.png)
#### Grant access to the Sheet
### Grant access to the Sheet
In order for the Cloud Run service to be able to see the URL mappings in the Sheet, I needed to share the Sheet with the service account. That service account is found by going to [https://console.cloud.google.com/run](https://console.cloud.google.com/run), clicking on the new `sheets-url-shortener` service, and then viewing the **Permissions** tab. I'm interested in the one that's `############-compute@developer.gserviceaccount.com`.
![Finding the service account](/images/posts-2021/08/20210820_service_account.png)
I then went back to the Sheet, hit the big **Share** button at the top, and shared the Sheet to the service account with *Viewer* access.
![Sharing to the service account](/images/posts-2021/08/20210820_share_with_svc_account.png)
#### Quick test
### Quick test
Back in GCP land, the details page for the `sheets-url-shortener` Cloud Run service shows a gross-looking URL near the top: `https://sheets-url-shortener-vrw7x6wdzq-uc.a.run.app`. That doesn't do much for *shortening* my links, but it'll do just fine for a quick test. First, I pointed my browser straight to that listed URL:
![Testing the web server](/images/posts-2021/08/20210820_home_page.png)
@ -47,14 +47,14 @@ This at least tells me that the web server portion is working. Now to see if I c
Hmm, not quite. Luckily the error tells me exactly what I need to do...
#### Enable Sheets API
### Enable Sheets API
I just needed to visit `https://console.developers.google.com/apis/api/sheets.googleapis.com/overview?project=############` to enable the Google Sheets API.
![Enabling Sheets API](/images/posts-2021/08/20210820_enable_sheets_api.png)
Once that's done, I can try my redirect again - and, after a brief moment, it successfully sends me on to Polywork!
![Successful redirect](/images/posts-2021/08/20210820_successful_redirect.png)
#### Link custom domain
### Link custom domain
The whole point of this project is to *shorten* URLs, but I haven't done that yet. I'll want to link in my `go.bowdre.net` domain to use that in place of the rather unwieldy `https://sheets-url-shortener-vrw7x6wdzq-uc.a.run.app`. I do that by going back to the [Cloud Run console](https://console.cloud.google.com/run) and selecting the option at the top to **Manage Custom Domains**.
![Manage custom domains](/images/posts-2021/08/20210820_manage_custom_domain.png)
@ -67,13 +67,13 @@ The wizard then tells me exactly what record I need to create/update with my dom
It took a while for the domain mapping to go live once I'd updated the record.
![Processing mapping...](/images/posts-2021/08/20210820_domain_mapping.png)
#### Final tests
### Final tests
Once it did finally update, I was able to hit `https://go.bowdre.net` to get the error/landing page, complete with a valid SSL cert:
![Successful error!](/images/posts-2021/08/20210820_landing_page.png)
And testing [go.bowdre.net/ghia](https://go.bowdre.net/ghia) works as well!
#### Outro
### Outro
I'm very pleased with how this quick little project turned out. Managing my shortened links with a Google Sheet is quite convenient, and I really like the complete lack of tracking or analytics. Plus I'm a sucker for an excuse to use a cloud technology I haven't played a lot with yet.
And now I can hand out handy-dandy short links!