<?xml version="1.0" encoding="utf-8"?>
<feed xml:lang="en-us" xmlns="http://www.w3.org/2005/Atom"><title>Simon Willison's Weblog: operations</title><link href="http://simonwillison.net/" rel="alternate"/><link href="http://simonwillison.net/tags/operations.atom" rel="self"/><id>http://simonwillison.net/</id><updated>2019-07-12T17:36:25+00:00</updated><author><name>Simon Willison</name></author><entry><title>Details of the Cloudflare outage on July 2, 2019</title><link href="https://simonwillison.net/2019/Jul/12/details-cloudflare-outage-july-2-2019/#atom-tag" rel="alternate"/><published>2019-07-12T17:36:25+00:00</published><updated>2019-07-12T17:36:25+00:00</updated><id>https://simonwillison.net/2019/Jul/12/details-cloudflare-outage-july-2-2019/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="https://blog.cloudflare.com/details-of-the-cloudflare-outage-on-july-2-2019/"&gt;Details of the Cloudflare outage on July 2, 2019&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Best retrospective I’ve read in a long time. The outage was caused by a backtracking regex rule that was added to the Web Application Firewall project. That project rolls out globally and skips most of Cloudflare’s regular gradual rollout process (delightfully animal themed, named DOG for the dogfooding PoP that their employees use, PIG for the Guinea Pig PoPs reserved for free customers, then Canary for the final step) so that they can deploy counter-measures to newly discovered vulnerabilities as quickly as possible. The real value in the retro, though, is that it provides an extremely deep insight into how Cloudflare organize, test and manage their changes. Really interesting stuff.

    &lt;p&gt;&lt;small&gt;Via &lt;a href="https://news.ycombinator.com/item?id=20421538"&gt;Hacker News&lt;/a&gt;&lt;/small&gt;&lt;/p&gt;


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/operations"&gt;operations&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/regular-expressions"&gt;regular-expressions&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/cloudflare"&gt;cloudflare&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/postmortem"&gt;postmortem&lt;/a&gt;&lt;/p&gt;



</summary><category term="operations"/><category term="regular-expressions"/><category term="cloudflare"/><category term="postmortem"/></entry><entry><title>The Virtues of Monitoring</title><link href="https://simonwillison.net/2011/Jan/13/monitoring/#atom-tag" rel="alternate"/><published>2011-01-13T04:26:00+00:00</published><updated>2011-01-13T04:26:00+00:00</updated><id>https://simonwillison.net/2011/Jan/13/monitoring/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://www.paperplanes.de/2011/1/5/the_virtues_of_monitoring.html"&gt;The Virtues of Monitoring&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Fantastic guide to the various levels of monitoring required for a modern web application.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/monitoring"&gt;monitoring&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/operations"&gt;operations&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sysadmin"&gt;sysadmin&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/recovered"&gt;recovered&lt;/a&gt;&lt;/p&gt;



</summary><category term="monitoring"/><category term="operations"/><category term="sysadmin"/><category term="recovered"/></entry><entry><title>Quoting Theo Schlossnagle</title><link href="https://simonwillison.net/2010/Mar/24/operations/#atom-tag" rel="alternate"/><published>2010-03-24T00:43:48+00:00</published><updated>2010-03-24T00:43:48+00:00</updated><id>https://simonwillison.net/2010/Mar/24/operations/#atom-tag</id><summary type="html">
    &lt;blockquote cite="http://omniti.com/seeds/the-cloud-is-great-stop-the-hype"&gt;&lt;p&gt;The operations team is the one place with access to data and traffic that is "real-time enough" to detect business issues before they manifest in significant monetary loss. Traffic anomalies, chargeback rates, visitor retention… all these translate into money. This is what ops does; they make things work; they make the business work. And they spend a lot more time trending, investigating and analyzing than they do replacing hard drives and network cards.&lt;/p&gt;&lt;/blockquote&gt;
&lt;p class="cite"&gt;&amp;mdash; &lt;a href="http://omniti.com/seeds/the-cloud-is-great-stop-the-hype"&gt;Theo Schlossnagle&lt;/a&gt;&lt;/p&gt;

    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/operations"&gt;operations&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/theo-schlossnagle"&gt;theo-schlossnagle&lt;/a&gt;&lt;/p&gt;



</summary><category term="operations"/><category term="theo-schlossnagle"/></entry><entry><title>Installing Django, Solr, Varnish and Supervisord with Buildout</title><link href="https://simonwillison.net/2009/Jun/7/bertrand/#atom-tag" rel="alternate"/><published>2009-06-07T13:54:44+00:00</published><updated>2009-06-07T13:54:44+00:00</updated><id>https://simonwillison.net/2009/Jun/7/bertrand/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://zebert.blogspot.com/2009/05/installing-django-solr-varnish-and.html"&gt;Installing Django, Solr, Varnish and Supervisord with Buildout&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Useful, detailed instructions... but I still think this stuff is Way Too Difficult at the moment. I’m a big fan of the idea of sites that are assembled from multiple smaller web services talking HTTP to each other, but ensuring all the moving parts stay running is massively more painful than just running Apache and MySQL.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/apache"&gt;apache&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/bertrand-mathieu"&gt;bertrand-mathieu&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/buildout"&gt;buildout&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/django"&gt;django&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/mysql"&gt;mysql&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/operations"&gt;operations&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/python"&gt;python&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/rest"&gt;rest&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/solr"&gt;solr&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/supervisord"&gt;supervisord&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/sysadmin"&gt;sysadmin&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/varnish"&gt;varnish&lt;/a&gt;&lt;/p&gt;



</summary><category term="apache"/><category term="bertrand-mathieu"/><category term="buildout"/><category term="django"/><category term="mysql"/><category term="operations"/><category term="python"/><category term="rest"/><category term="solr"/><category term="supervisord"/><category term="sysadmin"/><category term="varnish"/></entry><entry><title>Google uncloaks once-secret server</title><link href="https://simonwillison.net/2009/Apr/2/batteries/#atom-tag" rel="alternate"/><published>2009-04-02T10:47:59+00:00</published><updated>2009-04-02T10:47:59+00:00</updated><id>https://simonwillison.net/2009/Apr/2/batteries/#atom-tag</id><summary type="html">
    
&lt;p&gt;&lt;strong&gt;&lt;a href="http://news.cnet.com/8301-1001_3-10209580-92.html?part=rss&amp;amp;subj=news&amp;amp;tag=2547-1_3-0-20"&gt;Google uncloaks once-secret server&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
Instead of a data centre wide UPS and redundant power supplies, each Google server has its own 12V battery. They live in standard shipping containers, each holding 1,160 servers.


    &lt;p&gt;Tags: &lt;a href="https://simonwillison.net/tags/datacentres"&gt;datacentres&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/google"&gt;google&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/operations"&gt;operations&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/power"&gt;power&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/servers"&gt;servers&lt;/a&gt;, &lt;a href="https://simonwillison.net/tags/ups"&gt;ups&lt;/a&gt;&lt;/p&gt;



</summary><category term="datacentres"/><category term="google"/><category term="operations"/><category term="power"/><category term="servers"/><category term="ups"/></entry></feed>