Transforming IT Operations: A Survey of Effective Practices
Shawn Winnington-Ball
Information Systems And Technology
03 December 2013

Introduction
There are some fundamental problems in IT that people are working to solve
Let's examine some of the ideas and approaches
Knowledge gleaned from various sources: books, blogs, articles
A selection of what I find compelling

IT/business
IT is a critical function in the achievement of business goals
Business goals have become IT goals
We are past the point of no return: we can no longer fall back to manual processes
IT risk is therefore business risk

IT/business
In our digital society, there is tremendous value in using IT to create novel ways of enhancing our experiences
Digitalization (Gartner)
Business success is tied to IT success, and how creatively and capably the IT hammer can be wielded

IT risks
What are some of the risks that might prevent us from achieving our IT goals?
There's too much work to do already
Fixed culture: "we've always done it this way"
Sufficiently resilient/secure IT infrastructure
Silo mentality: the right people aren't talking about the right things
Insufficient understanding of true priorities

The approaches
From here on out, IT operations context
The Phoenix Project: IT is in the toilet, and the miraculous recovery
The DevOps movement: bury the hatchet
IT process improvement efforts, culture change

IT is a mess
The situation: too much to do, everything chaotic, messy, unordered
Where do you begin when overwhelmed?
Tough to build a house with a jumbled pile of bricks, lumber, screws and shingles
The right work isn't getting done: inefficient practices and processes

Unclogging the pipes
Analyze active work, see the big picture
Who spends the time on this work?
Which of the work is repeatable?
Which of it requires specialized knowledge?
What are the organization's true priorities and how does the work fit with them? Is there a disconnect?

Unclogging the pipes
Collect the work, categorize it
Projects, Infrastructure, Changes, Unplanned
Infrastructure development/maintenance work is internal project work: call it as such
20,000-foot view: what are All The Things currently underway?
This is our Work in Progress, active tasks

Unclogging the pipes
Clear the backlog: what is preventing the work from getting done?
Constraints and bottlenecks
Systematically clear them
Low-hanging fruit: cease unplanned work
Underlying causes: why does IT break?

Unclogging the pipes
Steady ongoing changes, make them less prone to causing unplanned work
Technical debt: taking shortcuts now will cause pain later
Control the release of work into IT
Demand outstrips capacity: don't auto-accept new commitments

Unclogging the pipes
Determine total IT capacity. What commitments can we reasonably take on?
Isolate key projects and freeze ongoing efforts for everything else
Identify the work that only one person does and standardize it, document the process
Elevate preventative work: if it breaks often it gets the most attention

Unclogging the pipes
Setting the tempo by our constraints
Say NO now but say YES later once the backlog is clear
It's easy to be honest about your capabilities when you have a clear picture

Free and clear
What can these ideas bring about?
Reduction in chaos
Ordered approach to work, priorities-based
No more uncontrolled change
Honest assessment of true capabilities
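A minimal sketch of the collect-and-categorize step and the capacity check from the "Unclogging the pipes" slides. Only the four category names come from the slides; the WorkItem fields, the function names, the sample work items and the 120 hours/week capacity figure are all hypothetical, invented for illustration.

```python
from collections import Counter
from dataclasses import dataclass

# The four work categories named on the slides.
CATEGORIES = ("project", "infrastructure", "change", "unplanned")


@dataclass
class WorkItem:
    # Hypothetical fields, chosen only for this illustration.
    title: str
    category: str
    hours_per_week: float
    owner: str


def summarize(items, capacity_hours):
    """Total committed hours per category and compare them against capacity."""
    hours = Counter()
    for item in items:
        if item.category not in CATEGORIES:
            raise ValueError(f"unknown category: {item.category}")
        hours[item.category] += item.hours_per_week
    committed = sum(hours.values())
    return {
        "hours_by_category": dict(hours),
        "committed": committed,
        "overcommitted": committed > capacity_hours,
    }


def single_owner_work(items):
    """Map each category to its owner when only one person works in it."""
    owners = {}
    for item in items:
        owners.setdefault(item.category, set()).add(item.owner)
    return {cat: next(iter(who)) for cat, who in owners.items() if len(who) == 1}


# Made-up work in progress for a small team with 120 hours/week of capacity.
wip = [
    WorkItem("Email migration", "project", 40, "alice"),
    WorkItem("Storage refresh", "infrastructure", 30, "bob"),
    WorkItem("Firewall rule updates", "change", 20, "bob"),
    WorkItem("Outage cleanup", "unplanned", 45, "carol"),
    WorkItem("Patch rollout", "change", 10, "dave"),
]
report = summarize(wip, capacity_hours=120)
print(report)
print(single_owner_work(wip))
```

With numbers like these in hand, "say NO now" becomes a statement about measured capacity rather than a judgment call, and the single-owner list points at the work to standardize and document first.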

DevOps overview
What is DevOps?
A collaborative approach to how IT development and operations relate
Tension between creating and maintaining
Development: fast, agile, creative
Production: stable, predictable, resilient
Reconciling different perspectives

DevOps overview
Born from the Agile development movement: fast code release, quick sprints
Speed is of the essence: companies need to keep up with competition, provide value quicker and more often, more reliably
The DevOps philosophy is summed up in three guiding principles

DevOps First Way
1. Systems Thinking
Performance of the entire system
Fast flow of work: continuous integration, deployment: small legos not big bricks
Understand that value is generated in IT from left to right: development to production, always moving forward
Reduce friction, increase velocity (Farr)

DevOps Second Way
2. Amplify feedback loops
Bring developers closer to their live code: if the sysadmin is on-call, why not the developer?
Shorten the time between learning of and correcting failures
When the system is broken, fix it before completing the work itself

DevOps Third Way
3. Culture of continual improvement and learning
Take risks, fail quickly, move on
Prevent failures from reaching production
The basis of improvement is practice and repetition: make it habitual and widespread
Test your supposed resilience: break things on purpose to see what happens

DevOps: the toys
Infrastructure as code: heavy use of configuration management
Versioned environments, automated deployments
Graph anything and everything
DevOps isn't about tools, but tools are invaluable to establishing the culture

The Visible Ops
Prescriptive guide based on ITIL
ITIL doesn't tell you where to begin; daunting effort
Authors provide 4 distinct phases of process improvement
Case study based: what do the shining stars have in common?

The Visible Ops
80% of outages caused by operator and application errors
Cultural problems
Change management is made too tough
Cowboy culture; misplaced sense of agility
Reactive, always firefighting, never planning
Constantly chasing audit requirements

The Visible Ops
Characteristics of high-performing orgs
High availability as measured by MTBF and MTTR
High throughput of successful changes
Investment early in IT lifecycle: release mgmt
Visible audit controls
IT ops and security working closely, mentor/mentee
Low amounts of unplanned work
Server to admin ratio > 100:1

The Visible Ops
Stabilize the patient
Identify most problematic infrastructure
Publish change policy: Thou Shalt Not Touch
Create designated change windows
Use Tripwire to verify compliance
Create a Change Advisory Board comprising stakeholders, use a change request tracking system
Initiate change management meetings (to authorize changes) and daily change briefings (to announce them)
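In this phase, Tripwire's job is to catch changes made outside the approved process. The sketch below shows only the underlying idea: hash a known-good baseline, hash again later, and compare. It is not Tripwire's interface or policy format; the function names and sample file paths are hypothetical.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Return {relative path: SHA-256 hex digest} for every file under root."""
    root = Path(root)
    digests = {}
    for path in sorted(root.rglob("*")):
        if path.is_file():
            rel = path.relative_to(root).as_posix()
            digests[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return digests


def diff_snapshots(baseline, current):
    """Report files added, removed, or changed since the baseline snapshot."""
    common = set(baseline) & set(current)
    return {
        "added": sorted(set(current) - set(baseline)),
        "removed": sorted(set(baseline) - set(current)),
        "changed": sorted(p for p in common if baseline[p] != current[p]),
    }


# Illustration with hand-written snapshots (digests shortened, made up).
before = {"etc/ntp.conf": "aa11", "etc/sshd_config": "bb22"}
after = {"etc/ntp.conf": "aa11", "etc/sshd_config": "cc33", "etc/rogue.conf": "dd44"}
print(diff_snapshots(before, after))
```

Any entry in the report that does not map to an approved change request is a change that bypassed the "Thou Shalt Not Touch" policy.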

The Visible Ops
Catch and Release & Find Fragile Artifacts
Interrogate all systems, ask many questions of them
Find the systems that are unique, scary, important, and historically problematic
Determine how many unique configurations you actually have
Document systems and services and interdependencies in a CMDB
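One way to answer "how many unique configurations do you actually have?" is to reduce each host to a fingerprint and group hosts that share one. This is a sketch under assumptions: the inventory records, field names (os, kernel, packages) and host names are hypothetical, and a real inventory would come from the CMDB or a discovery tool.

```python
from collections import defaultdict


def config_fingerprint(host):
    """Reduce a host record to a hashable tuple describing its configuration."""
    return (host["os"], host["kernel"], tuple(sorted(host["packages"])))


def group_by_configuration(hosts):
    """Map each distinct configuration fingerprint to the hosts that share it."""
    groups = defaultdict(list)
    for host in hosts:
        groups[config_fingerprint(host)].append(host["name"])
    return dict(groups)


def unique_hosts(groups):
    """Hosts whose configuration is shared with no other host."""
    return sorted(names[0] for names in groups.values() if len(names) == 1)


# Made-up inventory: two identical web servers and one hand-built database box.
inventory = [
    {"name": "web01", "os": "centos6", "kernel": "2.6.32", "packages": ["httpd", "php"]},
    {"name": "web02", "os": "centos6", "kernel": "2.6.32", "packages": ["php", "httpd"]},
    {"name": "db01", "os": "centos5", "kernel": "2.6.18", "packages": ["mysql"]},
]
groups = group_by_configuration(inventory)
print(len(groups), "unique configurations")
print("one-of-a-kind hosts:", unique_hosts(groups))
```

The one-of-a-kind hosts are the natural first candidates for the "unique, scary, important" list, and the group count is a number to drive down over time.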

The Visible Ops
Create a Repeatable Build Library
Infrastructure as fuses; replace, don't fix
Engineer builds for fragile infrastructure
Reduce unique configurations in production
Create Golden Builds: system images
Identify lowest common denominators across the environment

The Visible Ops
Continual Improvement
Metrics: can't manage what you can't measure
Fact-, not belief-based management
MTTR and MTBF are key, affected by release stage planning efforts
Closed loop between phases 1-3
Release, controls, resolution

LISA 2011
SREs at Google: Tom Limoncelli
Disconnect between dev and prod, competition brings them closer out of necessity
Faster feature release, pent-up waterfall methods no longer suffice
Dev teams run their own services for 6+ months
SREs provide self-service to devs: systems, storage, bandwidth, monitoring, docs: videos, wikis, SLA metrics

LISA 2011
Deployinator at Etsy: Erik Kastner, John Goulah
Speed and agility valued: 30+ code deploys/day
Be wrong as fast as possible
Graph everything that can be measured
The entire company is on IRC, up to CEO
Code push announcements are published via IRC bot

LISA 2011
Puppet: Luke Kanies
A pep talk for an obstinate, slow-moving sector
Competition drives innovation: do it better and faster than the next person
Zynga was adding 1000 servers per week (!)
Cloud computing is independence and self-service, not doing it all yourself, relying on sub-contractors

LISA 2011
Game Day: Jesse Robbins, Opscode
Things happen, adjust your response to them
Determine the MTTR on your own terms
Rules:
  Preparation: goals: mitigate impact, reduce MTTR, improve MTBF
  Participation: all hands on deck, everyone suffers together
  Exercises: trigger and expose latent defects, start small
Work up to full data centre outage!
Essentially positive outlook, can-do attitude
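MTTR and MTBF recur throughout these slides, so a worked calculation may help. The sketch below uses the common definitions (MTTR = total downtime / number of failures, MTBF = total uptime / number of failures) and assumes incidents do not overlap; the incident log, the observation window and the function name are hypothetical.

```python
from datetime import datetime, timedelta


def availability_metrics(incidents, window_start, window_end):
    """Compute (MTTR, MTBF, availability) from (start, end) outage pairs.

    Assumes the incidents do not overlap and all fall inside the window.
    """
    if not incidents:
        raise ValueError("need at least one incident to compute MTTR/MTBF")
    downtime = sum((end - start for start, end in incidents), timedelta())
    total = window_end - window_start
    uptime = total - downtime
    failures = len(incidents)
    mttr = downtime / failures      # mean time to repair
    mtbf = uptime / failures        # mean time between failures
    availability = uptime / total   # timedelta / timedelta gives a float
    return mttr, mtbf, availability


# Made-up log: three outages (1 h, 2 h, 3 h) in a 30-day window.
window_start = datetime(2013, 11, 1)
window_end = window_start + timedelta(days=30)
incidents = [
    (datetime(2013, 11, 3, 9), datetime(2013, 11, 3, 10)),
    (datetime(2013, 11, 12, 14), datetime(2013, 11, 12, 16)),
    (datetime(2013, 11, 25, 1), datetime(2013, 11, 25, 4)),
]
mttr, mtbf, availability = availability_metrics(incidents, window_start, window_end)
print("MTTR:", mttr)
print("MTBF:", mtbf)
print(f"Availability: {availability:.4%}")
```

Running the same calculation on Game Day exercises and on real outages gives a like-for-like view of whether rehearsed recoveries are actually getting faster.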

IT culture
"Tools, tools, tools" is the typical mantra
Discuss the ideas, habits and beliefs that underpin our approach to our jobs and IT
Technology is rapid, people aren't
"Give People priority. If a few more projects spent a third or more of their time, effort and money on People aspects (consultation, collaboration, walkthroughs, training, pilots, training, coaching, training, support, feedback...) instead of Technology and ITIL consultants, we might have some more successful ITSM implementations." (Rob England, itskeptic.org)

IT culture
How do you compel people to change their views and habits?
Address "how is this time any different?"
Address "how does this affect me?" and "what do I stand to gain from it?"
Courage to tell it like it is: be honest and don't avoid conflict out of fear
Be vulnerable, share your personal story

Conclusions
Many great ideas on how to advance IT operations to meet business goals
Perhaps we just need ideas to flourish in small pockets?
Can't ordain cultural change: find places where it will grow and support the good ideas
Organize more places to connect like-minded people

The Visible Ops, The Phoenix Project: Behr, Kim, Spafford
http://itskeptic.org (ITSM consultant, kinda grouchy, great critical perspective)
http://blogs.pinkelephant.com/troy (ITSM consultant, several years of blog material)
[email protected] Limoncelli
Opscode Gameday LISA 2011
http://agile.dzone.com/articles/agile-its-second-decade-0