2020-06-23

Technical debt, explained by Ward Cunningham himself, the creator of the concept.

This translation post is published under CC BY 3.0 based on the original article.

"Technical Debt" has been a hot topic in the world of system development, and it often sparks heated arguments.

The concept of technical debt was coined by Ward Cunningham. In 1992, in an Experience Report at OOPSLA '92, the international conference on object-oriented programming, he compared shipping first-time code to going into debt ("Shipping first time code is like going into debt").

Ward Cunningham has made many contributions to the software world. He invented the wiki, mentored Kent Beck (the father of XP and TDD), brought the "pattern language" of architecture into software together with Kent Beck, and is one of the authors of the "Manifesto for Agile Software Development" (for more information, see "パターン、Wiki、XP ~時を超えた創造の原則" or the blog articles "君はWard Cunninghamを知っているか？前篇、後篇").

So what led Ward to come up with the concept of technical debt? What was he thinking, and why? In fact, in 2009 Ward posted a video on YouTube, just under five minutes long, explaining the background of technical debt.

www.youtube.com

The transcript of this video is posted on wiki.c2.com (the world's first wiki, which Ward himself developed), and I have translated it below.

[Translation] Debt metaphor.

(The wiki translation follows in the original article.)

After the translation

What's surprising is that what Ward Cunningham is saying is quite different from the "technical debt" we envision.

The image we have of the term "technical debt" might be "crudely written code that prioritized the release and was left without a clean rewrite" or "a technical foundation (language, infrastructure, or framework) that has grown old". The Japanese Wikipedia article, for its part, describes it as "The Consequences of Haphazard Software Architecture and Unprepared Software Development". But Ward says these come from a misunderstanding.

According to Ward, the negative impact of debt is the productivity loss caused by the gap between the understanding gained through development and the system you are facing, not the maintainability (or clutter) of the code you're writing. Rather, he says, you should always write the best code you can at the time.

For Ward, the way to repay the debt is refactoring, and the purpose of refactoring is to eliminate the gap between the team's domain knowledge and the current program. In other words, this refactoring is something like "Refactoring toward deeper insight" in domain-driven design (DDD). The theme of domain-driven design is, as you know, "to tackle the core complexity of software". Of course, I think closing that gap is not limited to refactoring and also includes rearchitecting.

The word "debt" tends to sound more positive (as in capital) to those closer to management, and more negative (as in loans) to those closer to technology. The debt metaphor Ward is talking about is rather positive. The development approach of releasing software quickly and repeatedly, and learning from experience and hypothesis testing, is becoming more and more common today. In addition, Ward used the debt metaphor because he and the person he was explaining it to happened to be developing the same financial software. But since then, the strong word "debt" may have taken on a life of its own and created the current impression of technical debt.

By the way, the WyCash refactoring that led Ward to create the debt metaphor was a strong inspiration to Kent Beck and would appear in his later work, Test-Driven Development (Ward is the protagonist of the book's Introduction, and the theme of Part I, multi-currency money, is based on WyCash's domain objects).

Also, Ward consistently says only "Debt", without "Technical". I'm curious about who added "Technical" and why, but first I wanted to get this translation out into the world quickly.

2019-12-13

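Don't use maintenance windows

In my experience running a zero-downtime service for about six years, things go better when you don't use maintenance windows, that is, the feature that suppresses alert firing during planned maintenance. The topic of maintenance windows has come up repeatedly at work, so I want to write down my personal view.

Steps for planned maintenance

Even if a service never stops from the outside, internally you sometimes do maintenance that involves stopping something. For example, stopping MySQL happens from time to time. First, let's lay out how such maintenance proceeds.

For maintenance that involves an internal stop, estimate the time the work needs and the range of alerts it will trigger, and announce both in advance. Deciding where to announce is simple: post a notice in the channels where the alerts will arrive. My previous team used email and a Slack channel, so we wrote it there. That's all the preparation you need.

When the maintenance work starts (for example, failing over MySQL), some alerts fire. The maintenance team can also send an Acknowledge. If things go as announced, the people who received the alerts will keep an eye on the work and expect it to go well. The work finishes, the service starts running again, and the alerts turn green. Peace returns to the world.

Pretty ordinary, right?

Benefits of letting alerts fire on purpose

Compared with suppressing alerts beforehand, this has several benefits.

You can confirm that alerts are configured properly

Alerts for an internal service stop rarely fire. For an alert that has never fired, there is no way to confirm that it is configured properly. Staring carefully at the settings screen, maybe? Better not to trust human eyes. List the alerts you expect from the scope of the stop, and verify that they actually fire.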
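
The check described above (list the expected alerts, then confirm they fired) can be sketched as a simple set comparison. This is only an illustration: the alert names are hypothetical, and how you collect the list of fired alerts depends on your monitoring tool.

```python
def check_expected_alerts(expected, fired):
    """Compare the alerts announced before maintenance with those that fired."""
    expected, fired = set(expected), set(fired)
    return {
        # Announced but never fired: the alert may be misconfigured.
        "missing": sorted(expected - fired),
        # Fired but not announced: the impact was wider than predicted.
        "unexpected": sorted(fired - expected),
    }


# Hypothetical alert names for a MySQL failover.
expected = ["mysql-primary-down", "api-error-rate"]
fired = ["mysql-primary-down", "replication-lag"]

result = check_expected_alerts(expected, fired)
print(result)
```

Anything under "missing" is worth investigating after the maintenance, since that alert would probably also stay silent during a real outage.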

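The end of maintenance is communicated immediately

If the work finishes earlier than expected, the alerts settling down serves as the notification people need. There's no rush to post a "Maintenance finished successfully!" notice.

Less effort

There's nothing wrong with wanting to skip the effort of configuring maintenance windows properly. We're busy!

Summary

I've written about a way of doing planned maintenance where alerts fire on purpose, and about its benefits.

Of course, some teams will want maintenance windows. If so, please take a moment to consider whether the approach described here could apply. I'm interested in the circumstances that might make it hard to apply, so please share them.

This may also turn into a discussion about who owns the alerts.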

2019-12-10

Adding custom attributes in New Relic Logs

Hello. This is the 12/10 entry for New Relic Advent Calendar 2019 - Qiita.

New Relic Logs is a service for handling logs. You push logs into it from a log collector such as fluentd, then query them and set up alerts.

[Image: ogs!]

The data you push in can also be queried from Insights, which means you can probably do all sorts of things. So let's see how far we can play with it.

hostname, service_name

If you configure fluentd as described in the documentation, logs start being collected right away. As you'll notice, a few columns are left blank.

[Image: HOSTNAME and SERVICE_NAME look lonely]

For now, let's add them with a fluentd filter.

<filter **>
  @type record_transformer
  <record>
    hostname ${hostname}
    service_name "オーサムログ🚀"
  </record>
</filter>

Restart fluentd, send some logs, and...

[Image: 🚀]

They arrived! The Hostname filter on the left side looks like it will work nicely too.
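
As a rough mental model of what the filter above does to each log record, here is a small sketch. It is not fluentd's implementation, only an illustration of merging extra fields into a record; the function name add_fields and the sample record are made up for this example.

```python
import socket


def add_fields(record, extra):
    """Return a copy of the log record with the extra fields merged in."""
    merged = dict(record)   # leave the original record untouched
    merged.update(extra)    # added fields override existing keys of the same name
    return merged


# Mirrors the <record> section: the host name plus a fixed service name.
extra = {
    "hostname": socket.gethostname(),
    "service_name": "オーサムログ🚀",
}

record = {"message": "hello"}
enriched = add_fields(record, extra)
print(enriched)
```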

Adding more fields

Since the data lands in Insights, it feels like we should be able to add custom attributes. So I tried it.

Add a field called how_awesome...

<filter **>
  @type record_transformer
  <record>
    hostname ${hostname}
    service_name "オーサムログ🚀"
    how_awesome "⭐⭐⭐⭐⭐"
  </record>
</filter>

Restart fluentd, send some logs, and...

[Image: Five ⭐! (though only three are displayed)]

As you can see, it became queryable from Insights. The custom attribute isn't displayed on the Logs screen, but it does get autocompleted in search conditions.

[Image: how_...]

The sample code for this post is in the repository below.

github.com

So that was a quick run with New Relic Logs. Logs in Context, which can be used together with APM, is also available in public beta, and I plan to try it soon. If you're interested, please get in touch!

2019-12-05

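Recording deployments with GitHub Actions

Hello, this is @katzchang. This is the 12/6 entry for New Relic Advent Calendar 2019 - Qiita. Everyone, are you deploying today?

When a deployment happens, you're presumably expecting some intended change in behavior: a new feature, a performance improvement, and so on. And as you know, deploying an application carries risk. Most incidents turn out to be caused by an application deployment. I call these the "author's intent" and the "side effects" of a deployment (occasionally there are good side effects too).

In other words, behavior changes seen in monitoring and deployments are closely related.

With that background, New Relic APM lets you record deployments, so I'd like to introduce that feature! While I'm at it, I'll go as far as running it on GitHub Actions, which became generally available last month.

Recording a deployment

You can record a deployment through the API provided by each language agent, but here let's take the simple route of sending a POST request. Open the documentation and you'll see an example curl command. OK, looks doable.

[Image: the familiar command]

Written out as a Makefile, it looks something like this. It's almost unchanged.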

# Recipe lines must be indented with a tab, not spaces.
deploy-create:
	curl -X POST 'https://api.newrelic.com/v2/applications/$(application_id)/deployments.json' \
	     -H 'X-Api-Key:$(NEW_RELIC_REST_API_KEY)' -i \
	     -H 'Content-Type: application/json' \
	     -d '{"deployment":{"revision":"$(revision)","changelog":"$(changelog)","description":"$(description)","user": "$(user)"}}'

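With this, we're practically done.

Getting the REST API key

Following links from the documentation above leads to the documentation on obtaining a REST API key. I see, you get there from account management.

[Image: sitting awkwardly partway down the page]

For now, it's a good idea to put the generated API key in an environment variable. If you use zsh, add it to ~/.zshenv:

export NEW_RELIC_REST_API_KEY=NRRA-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Once you restart your shell, the environment variable should be set.

Getting the application_id

The quickest way to find the APM application id is to look at the URL.

[Image: APM URL]

The number after applications (475666074 here) is the application id.

Let's build the request

For the remaining parameters, let's just fill in something reasonable for now. The important thing is to get it running first.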

# Values taken from the latest commit. changelog is not set here, so it expands to an empty string.
# Note: a commit message containing quotes would break the JSON below.
revision:=$(shell git rev-parse HEAD)
description:=$(shell git log -1 --oneline)
application_id:=475666074
user:=$(shell git log -1 --pretty=format:'%an')

# Recipe lines must be indented with a tab, not spaces.
deploy-create:
	curl -X POST 'https://api.newrelic.com/v2/applications/$(application_id)/deployments.json' \
	     -H 'X-Api-Key:$(NEW_RELIC_REST_API_KEY)' -i \
	     -H 'Content-Type: application/json' \
	     -d '{"deployment":{"revision":"$(revision)","changelog":"$(changelog)","description":"$(description)","user": "$(user)"}}'
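
One caveat with pasting values straight into a JSON string is that a commit message containing a double quote would produce invalid JSON. As a hedged alternative sketch (not part of the original setup), the same request can be built in Python, letting json.dumps handle the escaping. The function name build_deployment_request is made up for this example, and the request is only constructed here, not sent.

```python
import json
import os
import urllib.request


def build_deployment_request(application_id, api_key, revision,
                             description, user, changelog=""):
    """Build (but do not send) the POST request that records a deployment."""
    url = (
        "https://api.newrelic.com/v2/applications/"
        f"{application_id}/deployments.json"
    )
    payload = {
        "deployment": {
            "revision": revision,
            "changelog": changelog,
            "description": description,
            "user": user,
        }
    }
    # json.dumps escapes quotes and other special characters safely.
    body = json.dumps(payload).encode("utf-8")
    headers = {"X-Api-Key": api_key, "Content-Type": "application/json"}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# "dummy-key" is a placeholder used when the environment variable is unset.
api_key = os.environ.get("NEW_RELIC_REST_API_KEY", "dummy-key")
req = build_deployment_request(
    475666074, api_key, "25ccfa8", 'fix "quoted" message', "Kazunori Otani"
)
print(req.full_url)
print(req.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(req)
```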

Let's run it

Let's fire it off.

make deploy-create

curl -X POST 'https://api.newrelic.com/v2/applications/475666074/deployments.json' \
         -H 'X-Api-Key:NRRA-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx' -i \
         -H 'Content-Type: application/json' \
         -d '{"deployment":{"revision":"25ccfa8ad1427caa9c4895ed4c6a72aad6f27c90","changelog":"","description":"25ccfa8 fix...","user": "Kazunori Otani"}}'
HTTP/1.1 201 Created
Proxied-By: Service Gateway
Content-Security-Policy: frame-ancestors *.newrelic.com
Cache-Control: max-age=0, private, must-revalidate
Content-Length: 293
Content-Type: application/json
Date: Thu, 05 Dec 2019 11:27:15 GMT
Etag: "1cbcc8f7b8f9124ba6fb97d66e67c1e6"
Server: nginx
Status: 201 Created
X-Rack-Cache: invalidate, pass
X-Request-Id: 62377002304bcbcbd1f584335e347804
X-Runtime: 0.294061
X-Ua-Compatible: IE=Edge,chrome=1

{"deployment":{"id":47717396,"revision":"25ccfa8ad1427caa9c4895ed4c6a72aad6f27c90","changelog":"","description":"25ccfa8 fix...","user":"Kazunori Otani","timestamp":"2019-12-05T03:27:15-08:00","links":{"application":475666074}},"links":{"deployment.agent":"/v2/applications/{application_id}"}}%

It says HTTP/1.1 201 Created, so it looks like it succeeded. Let's open the APM Overview and see how the deployment was recorded.
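
If you want a script to check the result instead of reading the output by eye, a minimal sketch might look like the following. It assumes only what the response above shows: a 201 status on success and a JSON body with a deployment.id field. The function name deployment_id is made up, and the body here is an abbreviated copy of the response above.

```python
import json


def deployment_id(status_code, body):
    """Return the recorded deployment's id, or raise if the call did not succeed."""
    if status_code != 201:
        raise RuntimeError(f"deployment was not recorded: HTTP {status_code}")
    return json.loads(body)["deployment"]["id"]


# Abbreviated version of the response body shown above.
body = (
    '{"deployment":{"id":47717396,'
    '"revision":"25ccfa8ad1427caa9c4895ed4c6a72aad6f27c90",'
    '"user":"Kazunori Otani"}}'
)

recorded_id = deployment_id(201, body)
print(recorded_id)
```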

[Image: an awesome deployment]

Perfect.

Trying GitHub Actions

GitHub Actions is GitHub's own service, which became generally available last month. It lets you hook into Git operations and attach all kinds of behavior.

Here, let's simply record a deployment whenever there is a push to master.

name: New Relic 

on:
  push:
    branches:
    - master

jobs:
  build:

    runs-on: ubuntu-latest

    steps:
    - uses: actions/checkout@v1
    - name: test
      run: make test
    - name: deploy create
      env:
        NEW_RELIC_REST_API_KEY: ${{ secrets.NEW_RELIC_REST_API_KEY }}
      run: make deploy-create

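We want to hide the API key, so we'll put it in Secrets. Look for the setting under the repository's Settings tab.

[Image: secret information]

Now, whenever you push to master or merge into master, a deployment marker is recorded. Open the repository's Actions tab to see the results of the action.

[Image: a glimpse of a history of struggle]

You could write the curl command directly in GitHub Actions. But with a Makefile, a consistent message is recorded whether you run it from another CI service, from a script on a server, or from your own MacBook. In other words, it's portable.

When to record? That is the question

This time, I chose to record a push to the master branch as a deployment. If you maintain a deploy branch you could detect changes on that branch instead, and if you use tags that might work too. These days there are often multiple VMs such as EC2 instances, or multiple containers, so it's sometimes hard to say which moment counts as the real deployment. But that doesn't really matter. For now, let's just make sure useful information shows up.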
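
The choice of trigger discussed above (a branch push, a deploy branch, or a tag) can be sketched as a small decision function over a Git ref such as the one GitHub Actions exposes in GITHUB_REF. The function name and its parameters are hypothetical, and this is just one possible policy, not the setup used in this post.

```python
def should_record_deployment(git_ref, deploy_branch="master", record_tags=False):
    """Decide whether a change to the given Git ref counts as a deployment."""
    # A push to the deploy branch always counts.
    if git_ref == f"refs/heads/{deploy_branch}":
        return True
    # Optionally, treat any tag as a deployment as well.
    return record_tags and git_ref.startswith("refs/tags/")


print(should_record_deployment("refs/heads/master"))
print(should_record_deployment("refs/heads/feature-x"))
print(should_record_deployment("refs/tags/v1.0", record_tags=True))
```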

And there you go: the deployment records are now visible from the Deployment tab.