SME_ITDR: Start of Day and End of Day as known recovery points


Had a discussion the other day about how an application should recover.

It’s obvious that real-time replication of data, databases, and “stuff” is NOT the same as a restart after a recovery.

(Well, maybe not so obvious to the techies who love these things.)

I made a case to a group of IT folks that an “application”, or suite of applications, or a portfolio of applications had to accomplish certain functions:

  • Determine the state of the world when it starts
  • Verify the data and files being presented are “correct” (i.e., IAAA checked)
  • Align all its data and files to the “correct” starting point
  • Run any transaction logs necessary to bring the app up to speed (i.e., recovery point runs up to current time or last good transaction time)
  • Allow for correction and catch-up to external sources
  • Preserve trails to demonstrate “correct” recovery
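
To make the list above concrete, here is a minimal sketch of a restart from a start-of-day snapshot. Everything in it is an assumption for illustration: the `recover` and `checksum` names, the account/balance layout, and the use of a SHA-256 digest as a stand-in for the integrity check. A real application would have far more to verify.

```python
import hashlib
import json


def checksum(state):
    """Stable SHA-256 digest of a JSON-serializable snapshot."""
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()


def recover(snapshot, expected_digest, transaction_log, cutoff):
    """Restart from a start-of-day snapshot and replay logged transactions.

    snapshot:        dict of account -> balance (hypothetical layout)
    expected_digest: digest recorded when the snapshot was taken
    transaction_log: list of (sequence, account, amount), in capture order
    cutoff:          last sequence number known to be good
    Returns (state, audit_trail).
    """
    audit = []

    # Verify the data being presented is "correct" before trusting it.
    if checksum(snapshot) != expected_digest:
        raise ValueError("snapshot failed integrity check; do not restart from it")
    audit.append("snapshot verified")

    # Align to the known starting point (copy, so the snapshot stays pristine).
    state = dict(snapshot)

    # Replay captured transactions up to the last good one.
    for seq, account, amount in transaction_log:
        if seq > cutoff:
            break
        state[account] = state.get(account, 0) + amount
        audit.append(f"replayed {seq}")

    # Preserve a trail that demonstrates how far the recovery went.
    audit.append(f"recovered through sequence {cutoff}")
    return state, audit


start_of_day = {"acct-1": 100, "acct-2": 50}
digest = checksum(start_of_day)
log = [(1, "acct-1", -30), (2, "acct-2", 20), (3, "acct-1", 5)]

state, trail = recover(start_of_day, digest, log, cutoff=2)
print(state)
print(trail)
```

The sketch leaves out the "catch-up to external sources" step on purpose; that one needs a reconciliation procedure per interface.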

If the “application” can NOT do all of these things then it is doomed to a completely manual recovery.

When an enterprise has thousands of “applications”, usually intertwined in a rat’s nest of complexity, the likelihood of a timely, successful recovery is directly proportional to the enterprise’s “luck”. (More likely to win the Lotto.)

And, it’s interesting what Leadership, Regulators, External Auditors, Internal Auditors, and Risk Managers will accept as “proof” this will all work when needed.

Glad it’s not my paycheck on the “pass line” at this particular crap shoot.

# – # – # – # – #  

SME_ITDR: Two weeks? Most businesses can’t recover at all!

New Standard: Two-Week Disaster Preparedness
What message are you telling people about disaster preparedness?
Eric Holdeman | March 31, 2015

*** begin quote ***

Three-day or 72-hour disaster preparedness messages have dominated the national message for decades when it comes to how long you should tell people to be prepared for disasters.

My thinking on this started to change in 2005 following Hurricane Katrina. We were about to launch a big public education campaign in King County, Wash., called “Three Days, Three Ways.” The three-way message was have a plan, build a kit and get training. At the time, when I checked with American Red Cross and Federal Emergency Management Agency about the possibility of that message changing, they said “no” so we went ahead with the campaign so as to standardize and not confuse people with different messages.

Now 10 years hence, Hurricane Sandy was another learning lesson and the great quake that could happen any day still is looming in our future. Many emergency management agencies in locations where you can have a huge regional disaster have moved on to telling their communities to become prepared for a week.

*** end quote ***

From an ITDR point of view, most businesses can’t suffer ANY interruption.

They can’t realign their data after a disaster.

Might as well close and start over.

If the recovery has not been designed in from the start — like start of day / end of day recovery points with automatic transaction capture to allow high speed replay — then you can forget that recovery and restart.


# – # – # – # – # 

LEAD_PPP: How to manage complexity?

How do you overcome complexity when leading resources that span across multiple continents?

Simplify! Keep things “small”! Establish firm boundaries — time, resources, and deliverables!

No HUGE projects that are 99% done until they are not done at all. No excuse for “overruns”; violate the boundary and we need to go back to the drawing board.

Conduct proofs of concept, pilots and lots of testing, from end user testing to suitability testing.

Make the Users the arbiter of “success”!

When the tests yield positive results, roll out a project on a small scale, not enterprise-wide.

Provide overwhelming support so problems are quickly solved and don’t generate bad press.

Keep the lines of communication open and ensure processes remain within the limits of time frames, budget money, function points, requirements and sign-offs.

Whether it’s a Program, Project, or Process, it is well served by this approach.

— 30 —


Vandalism in Arizona Shut Down Internet, Cellphone, Telephone Service Across State
Incident raises concerns a domestic or international terrorist could tamper with U.S. infrastructure
BY: Adam Kredo  February 27, 2015 5:00 am

*** begin quote ***

Cellphone, Internet, and telephone services across half of Arizona went dark on Wednesday after vandals sliced a sensitive fiber optic cable, according to those familiar with the situation. The incident is raising concerns about the safety of U.S. infrastructure.

The outage shut down critical services across large parts of the state, preventing individuals from using their phones, bank and ATM cards, and the Internet. Critical services, such as police and state government databases, as well as banks and hospitals, also were affected as a result of the vandalism.

The services first went dead around noon MST on Wednesday, causing complete service interruptions across half the state, from Phoenix to such northern cities as Sedona, Prescott, and Cotton Wood, according to an official from CenturyLink, the Louisiana-based communications company that owns the severed line.

*** end quote ***

Interesting point here — (1) Why? (2) Impact? (3) A probe?

And, of course, the implications for architects and engineers.

The question one immediately asks is: where is the diverse duplicate route?

Sounds like penny-wise and pound-foolish network design.

And, if I am paying my ISP as a business for a diverse route, do I get a refund, since it’s obvious I didn’t get what I was paying for?

How much did the impacted businesses lose with the cut?

And, just in case all those folks not impacted think they have nothing to be concerned about, perhaps this is your wake up call to drag out your network diagrams and think “what if?”
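
For anyone who wants to do the “what if?” with something more than a highlighter, here is a minimal sketch. The site names and links are made up (this is NOT the Arizona topology), and `single_points_of_failure` is a hypothetical helper. It simply removes one link at a time and checks whether every site can still reach every other.

```python
from collections import defaultdict


def reachable(links, start):
    """Return the set of sites reachable from start over undirected links."""
    graph = defaultdict(set)
    for a, b in links:
        graph[a].add(b)
        graph[b].add(a)
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                stack.append(neighbor)
    return seen


def single_points_of_failure(links):
    """List each link whose loss would cut at least one site off.

    Assumes a non-empty list of links that is fully connected to begin with.
    """
    sites = {site for link in links for site in link}
    start = min(sites)
    critical = []
    for i, link in enumerate(links):
        # Drop exactly one link (by position, so parallel links still count).
        remaining = links[:i] + links[i + 1:]
        if reachable(remaining, start) != sites:
            critical.append(link)
    return critical


# Hypothetical diagram: a ring of three sites plus one spur.
links = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")]
critical = single_points_of_failure(links)
print(critical)  # [('C', 'D')]
```

Site D hangs off a single link, so one cut takes it dark. Note the sketch only knows about logical links; two “diverse” circuits that share the same trench will still look fine here.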

— 30 —

SME_ITDR: In ITDR, there are no “silver bullets”


Disaster recovery as a service wipes out traditional DR plans
by: Paul Korzeniowski

Disaster recovery planning and infrastructure builds vex IT managers. Cloud services offer lower costs and more flexibility, but not without risk.

*** begin quote ***

How to construct a DR plan

First, outline potential disasters for the data center: Hazardous weather, power outages, vendors’ systems going offline, employee sabotage or outsider attacks are all possibilities.

Identify which of its hundreds of applications the corporation needs online immediately. Audit the list and prioritize by importance to daily operations.

Next, source and install redundant data center infrastructure — servers, software, network connections, storage — to support the applications. Disaster recovery plans cannot escape cost considerations; an offsite data center is expensive.

*** end quote ***

I would assert that this is EXACTLY how NOT to construct an ITDR plan.

I’d also assert that “the Cloud”, and “DRaaS” (Disaster Recovery as a Service), is not the “Silver Bullet”. (With apologies to Coors Light)

In the old mainframe days, professionals recognized what I call the partial recovery sequence. IT hardware is too expensive to duplicate, so let’s triage.

Since getting the tapes from Iron Mountain and going to Sungard or Comdisco took time, ITDR started with Business Continuity Planning (BCP).

And, BCP required Business Process Reengineering (i.e., what will the Business do until IT recovers, and what brings “money in the door”; note we don’t care about “out the door”, they can wait).

Early in my career, I noted an interesting behavior. I call it “Everything is critical UNTIL I have to pay for it! Then, nothing is.”

May sound funny, but the minute IT starts doing DR, it’s like money is no object.

Here’s an interesting experience I had at a large Financial Institution that shall remain nameless.

The Business Units said: “I can’t afford any down time, nor can I afford any data loss when I do take down time.” (Now that in and of itself is an interesting requirements statement, but this is about SME_ITDR; not situation appraisal.) No problem. Synchronous data replication to a bunker near the production data center, sync rep to a bunker near the recovery center, sync rep from the far bunker to the recovery center. Never lose any data ever. Price tag is $20M for about the first 5 applications; incremental after that in discrete chunks. Where’s the checkbook, Senior Business Unit Head Honcho?

The response was “what can I get for free???” (i.e., what level of service costs zero).

Easy answer: TANSTAAFL!! (“There Ain’t No Such Thing As A Free Lunch” From Robert Heinlein’s classic) 

Clearly, IT can NOT do ITDR in the absence of the Business — both from a cost and a process point of view.

The best that IT can do alone is to keep “cutting the homogeneous datacenter” into smaller and smaller discrete modules of service (i.e., the data center is like the motherboard and everything “plugs in” through discrete, well-defined interfaces).

At that previously mentioned large Financial Institution that shall remain nameless, the application portfolio had about 700 applications. An analysis of their Remedy data showed that, to recover any SINGLE application, one was required to recover about ⅔ of the portfolio.
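
That kind of number falls out of a simple walk over the dependency data. Here is a minimal sketch of the idea; the six-application portfolio is invented for illustration (it is not the Remedy data), and `recovery_set` is a hypothetical helper that follows “depends on” links until nothing new turns up.

```python
def recovery_set(depends_on, app):
    """Return every application that must be recovered for `app` to run.

    depends_on maps an application to the applications it needs directly.
    """
    needed, stack = set(), [app]
    while stack:
        current = stack.pop()
        if current in needed:
            continue  # already counted; also keeps cycles from looping forever
        needed.add(current)
        stack.extend(depends_on.get(current, ()))
    return needed


# Hypothetical portfolio, for illustration only.
portfolio = {
    "billing": ["ledger", "customer"],
    "ledger": ["reference-data"],
    "customer": ["reference-data"],
    "reference-data": [],
    "reporting": ["ledger"],
    "intranet": [],
}

needed = recovery_set(portfolio, "billing")
print(sorted(needed))
print(f"{len(needed)} of {len(portfolio)} applications")  # 4 of 6 applications
```

Run that for every application in a real portfolio and the size of the problem is no longer a matter of opinion.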

And, needless to say, that’s not happening any time soon.

Bottom line: One must design Business Processes that are recoverable; then the technology can be recoverable. Translated into IT-ese, start with the cart; not the horse.

— 30 —

STRATEGIC_INFORMATIONTECHNOLOGY: Solving Increased Data Backup and Recovery


*** begin quote ***

Live Webcast: Solving Increased Data Backup and Recovery

Date: Wednesday, March 11, 2015
Time: 10 am Pacific Time / 1 pm Eastern Time
Presented by: Michael Krutikov, Sr. Product Marketing Manager

With ever–increasing amounts of data, whether driven by datacenter evolution or just plain growth, there is a definite need for better solutions in today’s enterprise data centers. So the big question is, how do you solve for the increased amount of data while obtaining operational efficiency that delivers success for IT and ROI for an organization?

*** end quote ***

Interesting concept focusing on the data in the datacenter.

But what good does it do to have your data if your recovered systems are a mess?

That is, for example, what about:

  • The various job schedulers (autosys, job track, cron, tape management systems, job control language, job instructional language, batch, pseudo batch, etc.) all have to be “resynchronized”.
  • The various third parties that one receives from and sends to. In a disaster, one assumes that life elsewhere went on. Perhaps it is even complicated by your own organization using contingent methods to update data that the recovering systems are unaware of.
  • The “appliances” and “firewalls”, of various and sundry types, that keep state information in strange places.

In several previous employment “lives”, the solution was found in “Start of Day / End of Day” backups.

What this does is establish known good points from which the datacenter can restart with the sure and certain knowledge that ALL business and technology data is good and consistent across the enterprise. If, and this is a big if, the applications and systems can speed the clock from “the recovery point” to the “interruption point”, then the Business and Technology people can pick up right where they left off. Every “Third Party Involved” interface needs to have a “reconciliation procedure” to align the recovered data with that held in the Third Party’s systems.
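
A “reconciliation procedure” does not have to be exotic. Here is a minimal sketch, assuming both sides can produce a list of transaction ids with amounts; the `reconcile` name and the record layout are my own invention for illustration, not any particular Third Party’s interface.

```python
def reconcile(ours, theirs):
    """Compare recovered records with a third party's records by transaction id.

    ours, theirs: dict of transaction id -> amount (hypothetical layout)
    Returns (missing_here, missing_there, mismatched).
    """
    # They have it, we don't: lost between the recovery point and the interruption.
    missing_here = {tid: theirs[tid] for tid in theirs.keys() - ours.keys()}
    # We have it, they don't: sent but never received, or never acknowledged.
    missing_there = {tid: ours[tid] for tid in ours.keys() - theirs.keys()}
    # Both have it, but the two sides disagree on the amount.
    mismatched = {
        tid: (ours[tid], theirs[tid])
        for tid in ours.keys() & theirs.keys()
        if ours[tid] != theirs[tid]
    }
    return missing_here, missing_there, mismatched


recovered = {"t1": 100, "t2": 250, "t4": 75}
third_party = {"t1": 100, "t2": 200, "t3": 40}

missing_here, missing_there, mismatched = reconcile(recovered, third_party)
print(missing_here)   # {'t3': 40}
print(missing_there)  # {'t4': 75}
print(mismatched)     # {'t2': (250, 200)}
```

The hard part is not the comparison. It is getting every Third Party to agree, in advance, on what the ids are and who wins a disagreement.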

It’s rare to see a recovery that can happen this way. (I’ve never seen it. Despite advocating for it with many different Clients and Employers.)

It’s as if they can’t imagine a disaster, and as such prefer to gamble: (1) that it will never happen; and (2) that somehow, some way, they will muddle through. With that as their “strategy”, they do just enough to fool the auditors and their Leadership. Of course, it all comes tumbling down when “hard questions” are asked.

— 30 —

FIGURING: “Know Thyself!”

Need A Career Tuneup? Gallup’s Tom Rath Has A Quiz For You
9/04/2013 @ 9:26AM

*** begin quote ***

Are you a learner, an achiever or an includer? If you’ve seen those terms before, you’re probably one of the nine million people who has taken Gallup Inc.’s StrengthsFinder test. The workplace diagnostic quiz is a favorite at companies ranging from Facebook to Harley-Davidson. And it’s become a financial goldmine for Gallup, generating more than $100 million of revenue to date.

These are challenging times overall at Gallup, the opinion-research and business-consulting firm, as I explain in a major Forbes magazine story this month. But the company’s StrengthsFinder franchise keeps on humming. Prime evidence: the unstoppable appeal of “StrengthsFinder 2.0,” a book by Gallup executive Tom Rath. The book explains the test, offers some coaching and provides a security key that allows one reader per book to take the quiz online.

*** end quote ***

2015-Feb-23 Strengths Insight Report

# – # – #

Seems like a great idea for all those who are interested in understanding themselves.

“Know Thyself!” – ascribed to Socrates

— 30 —


LEADERSHIP_PROJECT: The difference between a “Book of Work” and a “Workbook”


15 Great Ways Project Management Can Help Your Growing Business
01/17/2014 Written by: Ian Needs

*** begin quote ***

Many SME’s are simply scared of the term “Project Management” or end up implementing a host of non-connected, counter-productive tools. 

*** end quote ***

Recently, I was involved with an application portfolio, “Book of Work”, that was so tightly integrated, that a full 75% of the portfolio was required to run any one application in the portfolio.

And, humorously, no one seemed upset about that. And, consequently, no one did anything about it.

Old Wall adage: “When in a hole, stop digging.”

I did some unofficial analysis of the “soil” from this particular “hole” and found 23 different ways that applications were allowed to become dependent upon each other. Interestingly, I found some direct dependencies (i.e., an app writes a file that it later reads back into itself) as well as indirect ones (i.e., app1 writes file1, app2 reads file1 and writes file2, app1 requires file2 to complete its work). Amusing!

In my experience, this comes from not having a development methodology, with policies, procedures, and processes, that will prevent “just get it done” type work.
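
File hand-offs are only one of those 23 ways, but they are easy to dig out. Here is a minimal sketch, assuming you can list which files each application writes and reads. I am reading the indirect example as app1 writing file1, app2 reading file1 and writing file2, and app1 needing file2. The `file_dependencies` helper and all app and file names are hypothetical.

```python
def file_dependencies(writes, reads):
    """Derive app-to-app dependencies from file producers and consumers.

    writes, reads: dict of app -> set of file names
    Returns dict of consumer app -> set of producer apps.
    """
    # Index each file by the apps that produce it.
    producers = {}
    for app, files in writes.items():
        for name in files:
            producers.setdefault(name, set()).add(app)

    # A consumer depends on whoever produces the files it reads.
    deps = {}
    for app, files in reads.items():
        deps[app] = set()
        for name in files:
            deps[app] |= producers.get(name, set())
    return deps


writes = {"app1": {"file1"}, "app2": {"file2"}, "app3": {"file3"}}
reads = {"app1": {"file2"}, "app2": {"file1"}, "app3": {"file3"}}

deps = file_dependencies(writes, reads)

# Direct: an app reads back a file it wrote itself.
self_dependent = sorted(app for app in deps if app in deps[app])
# Indirect: two apps each need a file the other one writes.
mutual = sorted(
    (a, b) for a in deps for b in deps[a] if a < b and a in deps.get(b, set())
)
print(self_dependent)  # ['app3']
print(mutual)          # [('app1', 'app2')]
```

Neither app1 nor app2 can be recovered without the other, which is how a portfolio quietly ties itself into one big knot.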

But, that’s why I’m not in charge. 

No, you can’t have it now, if it leaves us in a bigger hole than when we started.

The journey of a thousand miles starts with the first step. Make it in the correct direction.

I find it humorous that Microsoft Project is “too expensive” for any large organization. Have they ever looked at what they “waste”?


I like my projects “small” and bounded by “time”, “resources”, and “deliverables”. IF you can’t slice it into bite-size subprojects, THEN you deserve every overrun and under-delivery you get.

— 30 —

BPR: Infosec risk reduced by proper engineering

New iCloud phishing campaign discovered
February 13, 2015 by Lewis Morgan

*** begin quote ***

This is a cheeky one. Cyber thieves have been caught red-handed sending out phishing emails that are designed to steal financial information.

*** end quote ***

I NEVER have this problem.

I have my own domain.

I designed my approach around the only thing constant — the email address.

Not the one in the header, which can be, and often is, forged. But the delivery address. It’s got to be authentic; otherwise, how is it going to get to you?

By using your own domain, you give the BANK an email address for you of “BANK@”.

Then, anything that purports to be from the BANK, that does NOT come in on that address, is fraudulent.

Laugh. It doesn’t matter how authentic it looks, it CAN NOT come in on “their address” (i.e., the one I assigned them).

Needless to say, since I can create an unlimited number of these, and they all sort by a wildcard rule into a catch-all mailbox, it’s a trivial system to maintain.

So go ahead ne’er do wells, spam, phish, and con all you want, you can’t pretend to be my bank unless you crack the BANK and get the email address assigned to them. 

Oh, and BTW, I used “bank@” as an example. In practice, the “address” is more complex than that. “Bank” may actually be “9B94VPp8HhEU”.

But then what do you expect from a fellow whose Mom’s Maiden Name might be “UmuCZDBpB5FY” and whose first car was a “xF9DxMQk8CfK”?
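
For the curious, the whole scheme fits in a few lines. This is a minimal sketch, not my actual mail rules: `new_alias`, `looks_fraudulent`, and the `example.com` domain are made up for illustration, and it assumes your mail system can tell you which address a message was actually delivered to.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits


def new_alias(domain, length=12):
    """Create a random, hard-to-guess delivery address for one correspondent."""
    local = "".join(secrets.choice(ALPHABET) for _ in range(length))
    return f"{local}@{domain}"


def looks_fraudulent(claimed_sender, delivered_to, assigned):
    """Flag mail that claims a known sender but arrived on some other address.

    assigned maps a sender name to the one alias handed to that sender.
    """
    expected = assigned.get(claimed_sender)
    if expected is None:
        return False  # no alias on file, so this check has nothing to say
    # Compare case-insensitively, since most mail systems ignore case.
    return delivered_to.lower() != expected.lower()


assigned = {"BANK": new_alias("example.com")}

# Mail that really arrived on the bank's alias passes the check.
genuine = looks_fraudulent("BANK", assigned["BANK"], assigned)
# Mail "from the BANK" that arrived on any other address does not.
phish = looks_fraudulent("BANK", "someone@example.com", assigned)
print(genuine, phish)
```

The random part comes from Python’s `secrets` module, which is meant for exactly this sort of unguessable token.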


The sad part is that this is such a simple and easy process to implement, but, despite the number of times I have blogged it, talked about it, and demonstrated it, folks just don’t care enough to take such a simple step.

It’s all about simplicity and clarity.

— 30 —

FINANCIAL_JUSTIFICATIONS: Time, Talent, and Treasure?


Making Better Decisions Considering Time, Talent & Treasure

*** begin quote ***

Is this worth our Time, Talent, and Treasure? Below are suggested questions to ask about the project, plan, or idea…


Is it worth the time investment?
Are there better/more effective things we could be doing with our time?
Does the effort provide a worthwhile return?

*** end quote ***

I personally like: Attention, Effort, and Resources.

But it’s very similar.

I like “attention” as opposed to just “time”. 

We all have both a limited attention span and a limited amount of attention we can allocate.

It’s a qualitative and quantitative measure of the Leadership’s most limited resource.

If the decision maker can delegate an “above the waterline” unit of work without spending much attention on it, then that’s a big win.

But how many Leaders don’t consider the amount of “attention” that getting involved in something will take?

It’s more than just “time management”; it’s opportunity portfolio management.

Using “financial justifications” is one way to “focus on first things first”.

Rarely done before the “sunk costs” begin to pile up. The financial equivalent of a “body count”.

# – # – # – # – #