Tag Archives: itil

The ITIL cynic

“What is a cynic? A man who knows the price of everything and the value of nothing.”

Oscar Wilde

Sometimes I think this is how most people see ITIL and ITSM. Organisations implement the bare minimum of ITIL in order to say ‘We follow ITIL’, but because ITIL is seen as too costly to the business, only some form of incident, problem and change management gets implemented. That is ITIL, right? In my view, the IT organisation does not appreciate the true value of ITIL.

Say we go for a drink after work, after a hard day. I offer to get the drinks, but all I have is my credit card and there is a minimum charge for cards at the bar, so can you pay for the first round? I know the prices of most of the drinks, so I ask to borrow £10 and promise to buy the next drinks with some food. You give me £10 and, before you can say what you would like, I take the note and go to the bar. You shrug your shoulders and hope I will know what you want, but do I? The answer is clear when I come back with a pint of cider for me and, for you, a cocktail with sparklers, a little umbrella and so much fruit in it you think a gorilla will pop out. No? Hmmm, I think I have messed up, judging by your face as you look at your drink and the longing look you give my pint of cider. No problem, off I go again to rectify the matter with the change (we are not in London so I do still have some change), and I come back with… a glass of sparkling apple juice. I thought you liked apples and a fizzy drink, just like my pint of cider. At this point, you go to the bar yourself and buy a drink you actually want. Would you let me buy you another round?

This is my point. In the example, I know the price of the drinks, so I know I will come in under budget (£10), but have I provided value to my customer, you? I think the answer is no. I assumed I knew what you wanted without asking what you really wanted. If only I had asked ‘What would you like?’ I would have stayed within budget AND achieved customer satisfaction and value for money.

Sometimes I hear of companies who are very proud of their ITIL structure: incidents, problems and changes. However, how often do these companies meet with their business and find out how IT is perceived? Can the IT organisation do things better to provide more value? Has the value the IT organisation provides to the business improved since it started following ITIL? Does the IT organisation know the critical success factors for the different business units and how IT can help achieve them?

If the customer does not see any more value post ITIL than pre ITIL, or the business still feels it is not integrated with the IT organisation, then is the IT organisation following ITIL properly? ITIL is just an ideas book to help provide value; it is not a recipe book showing you steps 1-10 on how to bake a great IT cake.

This, I think, is the true value of ITIL. IT organisations should look at the value ITIL can help provide and not just the cost. Do not be an ITIL cynic.

Thank you for reading my post. This is my opportunity to blog about a subject I love but am still learning. These posts are my way of showing how I understand the subject; however, I would encourage you to leave comments. Did you agree or disagree with the post? Did I explain something poorly or incorrectly? Do you want me to blog about another subject within ITIL? All feedback helps me to understand more. Thank you.

Five laws of incidents and problems

Incidents and problems are in place to restore a service, fix an issue, work out why the issue or outage happened in the first place and then try and make sure this doesn’t happen again. All teams should be working together to make sure there is minimum downtime to the business on all incidents provided the right priorities are followed. We have all seen the analogies of incidents and problems.

eg. http://www.reddit.com/r/ITIL/comments/2d1zga/how_do_you_explain_the_difference_between/

https://itilbegood.com/2014/07/28/requests-incidents-problems-and-known-errors-in-a-nutshell/

However, where it gets a bit confusing is this: where does investigating an incident’s root cause and restoring the service cross over into problem root cause territory? Why should an engineer investigating an outage have to raise a problem if they already have the incident from the customer? Surely this all seems like a lot of paperwork for a few clicks?

Therefore, I wanted to put a stake in the ground, after a few years doing support, and then everyone can shout me down but at the end of the discussion / bloodbath we might have a solution. Of course it does depend upon organisations but there seems to be some confusion on incidents and problems.

At the heart of the matter is this truth:

Between incidents and problems, you should be able to restore the service quickly and find the root cause, with the cause of the incidents either mitigated or a work around published so future incidents can be fixed more quickly. The whole purpose is to provide fixes so that the business operation is minimally impacted. If there is an impact, the situation should be recovered and steps taken to mitigate or minimise the impact the next time it occurs.

OK, so let’s look at two incidents: in one, a customer can’t access their file shares; in the other, a customer calls in to say their Citrix session has hung… and two minutes later another person calls up to say their Citrix session has also hung.

For the first one, the support engineer picks up the call and, after some troubleshooting, realises the customer’s password has expired. After a reset and a reboot, the customer is up and running. The way to mitigate it is to tell the customer to reset the password before it expires. So this incident has gone through restoring the service, finding the root cause and mitigating the issue.

Next, the engineer checks the Citrix sessions and finds that both customers are on the same server. The engineer cannot remote onto the server, so the server looks like it has crashed. There is a known error entry which tells the engineer to take the server out of the load balancer and reset the customers’ sessions; the customers will reconnect to another server, so service is restored. The engineer then reboots the server and, upon reboot, the server looks fine. However, would you put the server back into the live environment?

These two incidents illustrate the issue. The engineer on the first call was competent to go through all the steps and complete the incident. However, is the engineer competent to go through all the steps of troubleshooting the server? Maybe not; maybe a Citrix team needs to be involved in checking out the server before it is put back into the production environment. This is where a problem should be raised. The incident can be closed or linked to the problem, but a problem should be raised because the cause of the crash still needs to be investigated while the production environment continues to function.

Law one, raising a problem comes down to the competency of the support team. Can they restore the service, find the root cause and mitigate it within an incident, or can they only restore the service and then raise a problem for a specialist team to find the root cause and mitigate the issue?

Next, time needs to be monitored on incidents. Engineers love to troubleshoot and fix issues, trying fix after fix to get to the bottom of the issue, but this may take an hour. Is this good for the business? If the engineer could put in a work around in the first 5 minutes and leave the customer to get on with their day, then raise a problem to investigate the issue further without needing to bother the customer, surely this is a better way of working from the business point of view?

Law two, where a work around is available for an incident, it should be implemented and a problem raised to find the root cause at a later date. The priority is to restore the service to the business.

When to raise a problem should be a matter of governance. ITIL explains this in ITIL Service Operation, page 99 (service operation process – Incidents versus problems):

The rules for invoking problem management during an incident can vary and are at the discretion of individual organisations.

Therefore, when to raise a problem is up to the organisation. In the example of the Citrix server, I would suggest a problem should be raised when many customers are impacted, when a key service or server is impacted, or to group incidents together to raise with 3rd party suppliers in supplier meetings. For example, the support teams notice a few hard drives failing in their first few months; these incidents could be grouped together and raised with the 3rd party supplier.

Law three, governance should write up rules on when a problem should be raised, and these rules should be clearly communicated to the IT organisation.

eg A problem should be raised for all Citrix server crashes and assigned to the Citrix team
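To make the idea of written-down rules concrete, here is a minimal sketch in Python of how such governance rules might be expressed. Everything in it is an assumption for illustration: the `Incident` fields, the `KEY_SERVICES` set and the `MANY_CUSTOMERS` threshold are hypothetical, and each organisation would set its own rules.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    summary: str
    service: str
    customers_affected: int
    root_cause_known: bool
    workaround_applied: bool


# Hypothetical governance settings; every organisation would choose its own.
KEY_SERVICES = {"citrix"}
MANY_CUSTOMERS = 2


def reasons_to_raise_problem(incident):
    """Return the governance rules this incident triggers (empty list = no problem needed)."""
    reasons = []
    if incident.service in KEY_SERVICES:
        reasons.append("key service impacted")
    if incident.customers_affected >= MANY_CUSTOMERS:
        reasons.append("many customers impacted")
    if incident.workaround_applied and not incident.root_cause_known:
        reasons.append("work around in place, root cause unknown")
    return reasons


# The two example incidents from this post.
password_expired = Incident("Cannot access file shares", "file shares", 1, True, False)
citrix_crash = Incident("Citrix sessions hung", "citrix", 2, False, True)

print(reasons_to_raise_problem(password_expired))
print(reasons_to_raise_problem(citrix_crash))
```

With these made-up rules, the expired password stays as an incident, while the Citrix crash triggers all three rules and would be raised as a problem for the Citrix team.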

Incidents should be monitored for trends and to check whether a problem could be raised to mitigate recurring incidents. Monitoring the incidents can also help check whether a work around could be put in place for a long running incident and a problem raised to find the root cause.

Law four, all incidents should be monitored for trend analysis and time to fix to see if a problem can be raised to mitigate the underlying issue.
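As a rough illustration of this kind of monitoring, the sketch below counts incidents per category and flags long running ones. The record layout (`category`, `minutes_to_fix`), the sample data and both thresholds are assumptions made up for the example; a real ITSM tool would have its own export format.

```python
from collections import Counter

# Hypothetical incident export; field names and values are illustrative only.
incidents = [
    {"category": "hard drive failure", "minutes_to_fix": 40},
    {"category": "hard drive failure", "minutes_to_fix": 45},
    {"category": "hard drive failure", "minutes_to_fix": 50},
    {"category": "password expired", "minutes_to_fix": 5},
    {"category": "citrix crash", "minutes_to_fix": 120},
]


def problem_candidates(incidents, min_count=3, max_minutes=60):
    """Return (recurring categories, long running categories) worth a problem record."""
    counts = Counter(i["category"] for i in incidents)
    recurring = sorted(c for c, n in counts.items() if n >= min_count)
    long_running = sorted(
        {i["category"] for i in incidents if i["minutes_to_fix"] > max_minutes}
    )
    return recurring, long_running


recurring, long_running = problem_candidates(incidents)
print("Recurring:", recurring)
print("Long running:", long_running)
```

Here the repeated hard drive failures show up as a trend to raise with the supplier, and the Citrix crash shows up as a long running incident where a work around plus a problem might have served the business better.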

Finally, once the root cause is found, through either incidents or problems, one of two things should happen:

– Mitigate the issue.
– Add the issue to the known error database with a workaround / fix.

Law five, all root causes should be mitigated or the fix time shortened by writing up a known error entry with a fix or work around.

I believe by following these laws engineers have scope to troubleshoot issues as they come in whilst the business operation down time is minimised.

What does everyone think?

Is your service desk really a help desk in a fish costume?

Let me start with some honesty. I hate calling call centres. Every time I call I seem to lose a little bit of the will to live, especially when they call themselves all manner of different names. My recent favourite was ‘a customer experience agent’, when all I get is passed around, with everyone telling me someone else should be dealing with my call or that they will check and give me a call back. If I had a pound for every time I was told that. A customer experience centre is still a bad call centre if I still get passed around and nobody really knows what to say or do.

I have just moved house and needed to register with British Gas (a UK gas and electricity utility company). I needed to do four things on the call:

– Give them all the details: name, address, occupation etc.
– Set up a direct debit/giro so every month the right amount for the bill would be taken out of my account automatically.
– Register to add points to my Nectar card, a UK points card which gets me money off my supermarket shop.
– Register for Hive, an awesome new system where I can control my heating from my phone/tablet/computer, and Hive knows when I am coming home and leaving and switches my heating on or off accordingly.

Hmmm, so not too much that could go wrong. Immediately the person who picked up the call heard what I wanted, paused, stuttered and said she needed to put me through to someone else. Already I am losing the will, getting a little frustrated that I want to give them money but they are making it so hard. Maybe a little harsh, but this is normally how it starts and it only gets worse.

Then another lady picked up the phone and she nailed it, names and address…done, direct debit….done, nectar card points…done, hive…errr, never done one of these before, hold while she asked someone how to do it (this is not a problem, it is a new system so I thought I was probably a first), then bang I have an engineer coming out the following week.

Still with me, wondering what this has to do with fish and ITIL? Well, recently I saw on a forum the title ‘How to change a help desk into a Service Desk’. The person had been tasked with changing a help desk into a service desk because that is what ITIL says and you can’t be ITIL compliant without it. I thought, that is like calling a call centre agent a customer experience agent. A name doesn’t change a thing; it is what you do that changes the customer’s perception of the team.

People should ask ‘How do I change my IT organisation into an ITIL IT organisation?’ rather than changing team names and thinking they are done.

I have worked in many support environments, all called many different names and some supposedly within an ITIL framework. However, I would say all were a help desk once you took away the nice names. A help desk, to me, was and is a team which takes all the calls about anything, tries to fix anything and, if they cannot, has to beg and plead with other support teams to help them, as there are no internal support agreements for getting assistance. If the Help Desk cannot get help then they hold onto the ticket and try over a few days to resolve it. The customer thinks the help desk is a little hit and miss; one customer even once said ‘why don’t you just call yourself desk, instead of help desk?’

How does this differ to a Service Desk? This really is a trick question as if the organisation hasn’t changed to ITIL then the Service Desk is still just a Help Desk with a new name.

Please read my earlier blog posts to get an idea of what ITIL is about  :

https://itilbegood.com/2014/04/07/what-is-itil/

https://itilbegood.com/2014/07/19/service-management-as-a-rugby-game/

If the IT organisation wants to be an ITIL organisation, they should :

Have a catalogue showing the business what services are supported and the service desk knowing how these services are supported.

The services should be backed up with a configuration database showing how these services are configured. The aim here is to give support engineers access to the latest configuration of the service, with all its components, to make troubleshooting easier so the service is resumed quickly. This is not an exercise in creating a database and ticking a box; it has to be usable and up-to-date. How the information is presented should be decided after speaking to the stakeholders who will use it. The database is a read and write database, not a write database that nobody reads.

OLAs should be written to show the internal IT organisation’s resolution times for services, what is included, who can be involved in these fix times and how changes to these services should be implemented. If it comes in a 30 page document, ask yourself, if you were the engineer who just got a call saying nobody can access their e-mails, would you know:

Where the OLA is held?
How the support teams should escalate the incident?
Who should be on a bridge call to help fix it?
How to use the configuration database to troubleshoot?
What support agreements with 3rd parties are in place?
What should be communicated to the business about the issue, and by whom?
How an emergency change or a normal change should be implemented, and by whom?

All while users are screaming at you to fix it. If you feel you couldn’t, fix the documentation to make it easier. One suggestion would be to create a SharePoint dashboard from the 30 page OLA document which support engineers can look to for easy reference.

After the OLA is created, SLAs can be written so the business knows how long an incident, request or outage should take to complete.

So, now the IT organisation knows :

What services are supported
How the services are configured
How the services are supported

Next, how does the business tell you they want something or that something is broken? This is done through requests and incidents. The Service Desk should categorise the requests / incidents and add a priority to them. The priority comes from the OLA. When the service is restored to normal the incident is closed.

https://itilbegood.com/category/in-a-nutshell/

Overview of requests, incidents, problems : https://itilbegood.com/2014/07/28/requests-incidents-problems-and-known-errors-in-a-nutshell/

If an incident can’t be resolved definitively, so only a work around can be used, or an incident has been closed but the root cause could not be found, open a problem. This can be worked on by an individual or a team of people to find the root cause and a fix for the incident.

As ITIL is all about constant improvement, there should be some sort of incident and problem management to analyse the incidents and problems and see whether they can be reduced or handled better through additional training or better procedures.

If you want to make a change to the service, a group of people (defined in the OLA) should assess the change in a regular meeting for proposed work, impact, back out plan, timings (is this within a change window defined in the OLA) and if the business needs to be aware, either by the business being in the same change meeting or a business communication, or both. This should minimise the impact to the business for any changes to services.

Finally, make sure there are some reports showing the business how IT is doing and the value provided. Maybe a report showing the number of changes (successful / unsuccessful), incidents (closure rate/time, categories), problems (types, closure rate, resolutions), SLA (within SLA and if not, what steps have been taken to rectify this)
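A report like this could be put together from closed tickets. The following is only a minimal sketch in Python, assuming each ticket is a simple record with the hypothetical fields `type` and `within_sla`; the sample data is made up, and a real ITSM tool would have its own reporting features.

```python
from collections import Counter

# Hypothetical closed tickets for one reporting period.
tickets = [
    {"type": "incident", "within_sla": True},
    {"type": "incident", "within_sla": False},
    {"type": "request", "within_sla": True},
    {"type": "change", "within_sla": True},
]


def sla_report(tickets):
    """Summarise ticket volumes by type and the percentage closed within SLA."""
    total = len(tickets)
    met = sum(1 for t in tickets if t["within_sla"])
    return {
        "total": total,
        "by_type": dict(Counter(t["type"] for t in tickets)),
        "within_sla_pct": round(100 * met / total, 1) if total else 0.0,
    }


report = sla_report(tickets)
print(report)
```

The numbers on their own are not the point; what matters is that the report prompts the follow-up question of what steps were taken for the tickets that missed their SLA.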

Now the IT organisation knows :

How to log incidents and requests.
How to investigate incidents in more depth.
How to improve / spot trends with the incidents and problems process.
How to make changes to services in a controlled way.

Finally, the IT organisation should put in place some method of improvement. Can areas of IT be improved to provide better service or value to the business?

The IT organisation needs to provide these support structures to help the Service Desk. Without them, the Service Desk is still a Help Desk.

Remember

Help Desk

Do we still need ITIL?

This is a great blog post http://optimalservicemanagement.com/blog/do-we-still-need-itil/. It reminds me of WHY you do ITIL, not because the ITIL book says so on page 32 but to provide value to the business.

If you just follow ITIL blindly then you will create a mess. Engage brain and see how bits can work for you, maybe some won’t work for you or could in future and that is ok. Just look to ‘adopt and adapt’ to make your IT organisation the best value for money it can be to the business.

Requests, Incidents, Problems and Known Errors in a nutshell

Over the past few weeks I have noticed some talk and discussion around what incidents, problems and requests are and what are the differences between them on some of the ITIL blogs. So here is my take :

Requests

These are requests made by the customer, eg please can you install x software or please can you replace the toner on the sales printer. These types of ‘can I haves’ should be logged as a request. These are separate from incidents, as they will have different SLAs and priorities associated with them. Installing a piece of software for one member of the sales team has a different priority from someone in the sales team being unable to access the network shares.

Incidents

These are for when a thing breaks or isn’t working, eg my PC won’t turn on, I can’t access any network shares or none of the print outs are coming out of the printer. These are different to requests as it normally means the customer or team cannot work, or a service is degraded so they can’t work as well. The person who picks up the incident will assign a priority, eg a whole office who can’t access the network might be a priority 1 incident and a customer who can’t print might be a priority 3 incident. These priorities should be documented with an SLA associated with them so the business will know roughly how long an incident of this type will take to fix. Again, it is up to you and the business to work out these priorities and SLAs; ITIL is just a guide. The incident can be closed when the incident is fixed permanently or a work around has been put in place which restores the service back to normal.

Ahh, and this is where some will wheel out the old chestnut: is a password reset an incident or a request?

Answer

1) Why is this not automated? Plenty of tools allow customers to reset their password themselves without needing to log an incident/request.

2) It is up to you and how you want to define it. All you are trying to do is separate incidents from requests (which sometimes do not have as high a priority as an incident), and be able to produce stats on the two to show trends, which helps with incident and request management and with reporting to the business to show how great IT are.
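To show how documented priorities and SLAs for incidents and requests might fit together, here is a small sketch in Python. The `SLA_TARGETS` table is entirely hypothetical; as said above, the actual priorities and timings are for you and the business to agree.

```python
from datetime import datetime, timedelta

# Hypothetical targets keyed by (ticket type, priority); agree your own with the business.
SLA_TARGETS = {
    ("incident", 1): timedelta(hours=4),
    ("incident", 3): timedelta(days=2),
    ("request", 3): timedelta(days=5),
}


def due_by(ticket_type, priority, logged_at):
    """Return the time by which the ticket should be resolved under its SLA."""
    return logged_at + SLA_TARGETS[(ticket_type, priority)]


def within_sla(ticket_type, priority, logged_at, closed_at):
    """Return True if the ticket was closed on or before its SLA deadline."""
    return closed_at <= due_by(ticket_type, priority, logged_at)


logged = datetime(2014, 7, 28, 9, 0)
print(due_by("incident", 1, logged))
print(within_sla("incident", 1, logged, datetime(2014, 7, 28, 12, 30)))
```

Keeping the ticket type in the key is one way to let a priority 3 request and a priority 3 incident carry different targets, which is the separation the password reset question is really about.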

Problems

What happens if all the person who picks up the incident can do is produce a work around, or they don’t know why the fix worked (eg reboot the PC and the problem goes away), or multiple customers are logging the same type of incident? The issue still exists, but there is a sticking plaster holding everything together. Now problems come into play. A problem is something where a virtual problem team or an individual can look into the issue more deeply, hopefully finding the root cause and a permanent fix. A problem is also something that can be taken ‘off line’. The service has been restored and the incident closed, so the danger has passed, but the problem can be used to investigate over a longer period to find the real issue.

Known errors

Through your diligent problem management and investigation, the root cause is found. However, like most things in life, it is not an easy fix. The fix requires a new server, cabling or the manufacturer of the component has acknowledged there is an issue but there is no driver update so all you can do is stick with the work around. ITIL has rather cleverly thought of this scenario and known errors can be used.

e.g.

An incident was logged and a work around took two days to come up with, but the manufacturer needs to update a driver before a permanent fix can be implemented. If someone logs a similar issue, the wheel doesn’t need to be reinvented: a known error should have been created after the first incident’s work around was found, so this can be used to implement a fix/work around quickly for the second incident.

Known errors and the known error database greatly reduce the fix times for subsequent, similar incidents which are awaiting permanent fixes, or where there are other reasons why a permanent fix can’t be implemented and a work around is as good as it is going to get.
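As a very rough sketch of how a known error lookup could work, the Python below matches a new incident summary against keyword sets. The `KnownError` structure, the sample entries and the naive word matching are all assumptions for illustration; real ITSM tools provide their own known error search.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnownError:
    title: str
    keywords: frozenset
    workaround: str


# Hypothetical known error database entries.
KEDB = [
    KnownError(
        "Citrix server crash",
        frozenset({"citrix", "hung"}),
        "Take the server out of the load balancer and reset the customers' sessions",
    ),
    KnownError(
        "Printer driver fault awaiting manufacturer update",
        frozenset({"printer", "driver"}),
        "Restart the print spooler",
    ),
]


def find_known_errors(summary, kedb):
    """Return known errors whose keywords all appear in the summary (naive whitespace split)."""
    words = set(summary.lower().split())
    return [ke for ke in kedb if ke.keywords <= words]


matches = find_known_errors("My Citrix session has hung", KEDB)
for ke in matches:
    print(ke.title, "->", ke.workaround)
```

The second caller with a hung Citrix session gets the published work around straight away, rather than waiting for someone to work it out again.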

Hopefully it is now a little clearer what requests, incidents, problems and known errors are and what the differences between them are.

Service Management as a rugby game

I realise the game of rugby might not be the most obvious analogy which springs to mind when you think about Service Management but hear me out.

Rugby, for me, has always been a great spectator sport; I have more the physique of the ball and not the man mountains of players. I marvel at the discipline these giants display for the game and how the game does not descend into a bar room brawl with so much muscle and will to win in such a small area.

When I think about great IT customer support, it is all about the skills of the individuals and the hand over to other support teams. Skill and great hand overs to other support teams can win or lose the IT support game. IT support is always a battle to resolve issues efficiently before too much time passes and customer frustration increases.

Picture the field, the IT organisation vs Customer Frustration and Time. The whistle blows, it is game time!!! The ball goes into the IT service desk scrum and the incident ball comes out to the IT organisation’s support team. The first line engineer is running with the incident ball, only to be put to ground by Time. Over the top comes support from the second line teams, the ball is handed over to the second line engineer seamlessly, and the engineer side steps Customer Frustration with clear communication. Oh no, Time comes in, tackles the ball and is now running with it, and second line support chases the ball down. Time’s lead is growing with Customer Frustration following up quickly behind, but Time is skilfully tackled by second line, who run the ball back, and the final ball is handed over to the third line engineers, the fastest and most experienced players on the field. With lightning footwork the ball goes down for the try and the incident is resolved.

Without great hand overs of the ball, Customer Frustration and Time would get the ball, and the business areas, who have paid money to see the IT organisation win, would not see value for money. If the support individuals cannot hand the incident ball off to each other, an individual player must try to jink past the opposition to close the incident. This will sometimes work, depending on the skill of the individual support engineer and the ease with which the incident can be closed, but sometimes it will not. If the IT organisation can win with individual skill, great hand overs and team work, then the business areas see the value of paying to come and support the IT organisation.

The other thing I enjoy about rugby, and most other sports, is the analysis of all parts of play, the breakdown and repeats of every tackle, shot, space the players should have used etc.

This is the area where the service management team, the coaches, come in. They can take apart the play; they see that the 1st line engineer fumbles the ball on pick up. A work around could be designed for the present game, but a problem could be created to go away and really analyse the issue to come up with a fix, maybe some grip on the gloves or a textured ball to make it less slippery. Communication between the second and third line support teams might be poor, so the ball was intercepted and needed to be won back. Encouraging better communication between the two leads to better and more fluid play.

Various areas of improvement could be categorised, like in the ITSM tools, to be later broken down into target areas, eg running down the line, communication, creating space etc, which can be worked upon away from the game in set areas of expertise.

The service management team can also look at the agreements between the various team members showing who is going to take the hand off ball and who is going to come and protect the ball. This goes some way to designing an OLA: an agreement within the IT organisation showing how an incident should be handled, the support timings and the items covered by the agreement. This should be in a format that, in the heat of play, can be understood easily and quickly.

Documenting how set pieces of play should be played, and making sure all team members know what is required and how to do it, is also an important part of the Service Management team’s job.

IT service management is all about creating value for the business areas and the best customer experience. The play might not be the finished article and individuals and teams might need some work, but if the IT organisation is committed to ITIL and service management, it will work at these areas, making small and large gains and improvements, and reminding the business areas why they pay for their IT organisation and the value it creates.

Hopefully I have gone some way to try and convince you that rugby and IT service management are not too dissimilar after all.
