Tips for Taking Charge of a SysAdmin Team

Sunday, January 16, 2011

Jamie Adams

Over the years I have worked in various roles, but the most challenging has been assuming control of an already established system administration team.

Several times, a former colleague moved to a new company and inherited a system administration organization that needed some improvements.

I am proud to say that those colleagues convinced me to work for them again to overhaul the system administration operations.

Each of us has our own leadership style, and everyone's approach may differ from organization to organization; nonetheless, I wanted to write about my approach. First of all, I have a handful of generic questions I ask:

  • What am I responsible for?
  • Does the architecture make sense? Are there any overlaps of responsibility?
  • Who has access to the assets I am responsible for?
  • What are we doing to maintain system availability? Backups?
  • What is our configuration management process? How fast do we turn up systems and introduce new applications?
  • What are the strengths and weaknesses of my team?
  • What are we missing?

When I enter the organization, I don't immediately request access to the systems but rather I begin going over architectural diagrams, operational procedures, and just peering over the shoulders of the system administrators. If the aforementioned documents are not present, then we have a problem.

It is imperative that there is a clear understanding of the system components within the architecture. I would immediately have the team begin compiling diagrams and workflows so that we can understand the system's architecture.

This would include high-level diagrams as well as a detailed asset management inventory of EVERY host. I want to see every host, its operating system version, and respective application versions (e.g., Tomcat, Oracle, Apache).
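As a rough illustration, an inventory like this could be kept as structured records rather than free-form notes, so that gaps stand out quickly. The sketch below is only one possible shape: the `Host` class, its field names, and the sample hosts are all hypothetical and not taken from any real environment.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Host:
    """One inventory record; the field names are illustrative assumptions."""
    hostname: str
    os_version: str
    applications: dict = field(default_factory=dict)  # app name -> version


def incomplete_records(inventory):
    """Return hostnames whose OS version or application list is missing."""
    return [h.hostname for h in inventory if not h.os_version or not h.applications]


def os_summary(inventory):
    """Count hosts per operating system version to expose one-off platforms."""
    return Counter(h.os_version or "unknown" for h in inventory)


# Hypothetical sample data for demonstration only.
inventory = [
    Host("web01", "RHEL 5.5", {"Apache": "2.2.3", "Tomcat": "6.0.29"}),
    Host("db01", "RHEL 5.5", {"Oracle": "11.2.0.1"}),
    Host("legacy01", "HP-UX 11i v3", {}),
    Host("app02", "", {"Tomcat": "6.0.29"}),
]

print(incomplete_records(inventory))  # ['legacy01', 'app02']
print(os_summary(inventory))
```

Records with an empty OS version or no listed applications are the ones to chase down first, and the per-OS count makes any lone, unusual platform easy to spot.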

I would also require networking diagrams and a mapping of each system component to a particular organizational group. For example, a particular database contains billing information and it is used by group XYZ.

Who maintains that database schema and the software which manages the data? I would begin mapping those groups to system components so my team has a clear understanding of which organizations they are supporting.
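One way to keep such a mapping honest is to store it as data and ask it which components still have unanswered ownership questions. This is a minimal sketch under assumed names: the component names, the "DBA team" and "Web team" labels, and the `used_by` / `maintained_by` keys are all made up for the example.

```python
# Hypothetical mapping of system components to the group that uses them
# and the team that maintains them; None marks an unanswered question.
component_owners = {
    "billing-db": {"used_by": "Group XYZ", "maintained_by": "DBA team"},
    "billing-app": {"used_by": "Group XYZ", "maintained_by": None},
    "intranet-web": {"used_by": None, "maintained_by": "Web team"},
}


def unowned_components(owners):
    """List components that lack a consuming group or a maintainer."""
    return sorted(
        name
        for name, info in owners.items()
        if not info.get("used_by") or not info.get("maintained_by")
    )


def components_by_group(owners):
    """Invert the mapping: which components does each group depend on?"""
    result = {}
    for name, info in owners.items():
        group = info.get("used_by")
        if group:
            result.setdefault(group, []).append(name)
    return result


print(unowned_components(component_owners))   # ['billing-app', 'intranet-web']
print(components_by_group(component_owners))
```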

Next, I would set out to change the password of every privileged account in the system. This is where some people become upset, but remind yourself it is for the best.

As we make changes to the system to stabilize it or improve performance, we need to know exactly what changes were made. This password changing step is imperative in order to clearly identify who has access to the system.

I would ensure that every system is logging and auditing accordingly so you can see who is attempting and gaining access to privileged accounts. Furthermore, I would no longer allow any privileged account (e.g., root and oracle) to be logged into directly.
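To give a sense of what reviewing those logs might look like, the sketch below scans log lines for direct SSH logins to privileged accounts. It assumes the common OpenSSH `sshd` message wording ("Accepted password for root from ..."), covers SSH only (not console logins or `su`), and uses invented sample lines; the log location and exact format vary by system, so treat it as a starting point.

```python
import re

# Matches typical OpenSSH sshd messages such as
# "Accepted password for root from 10.0.0.5 port 51234 ssh2".
SSHD_LOGIN = re.compile(
    r"(?P<result>Accepted|Failed) (?P<method>\S+) for "
    r"(?:invalid user )?(?P<user>\S+) from (?P<source>\S+)"
)

PRIVILEGED = {"root", "oracle"}  # the example accounts named in the article


def direct_privileged_logins(lines):
    """Return (result, user, source) for direct logins to privileged accounts."""
    hits = []
    for line in lines:
        match = SSHD_LOGIN.search(line)
        if match and match.group("user") in PRIVILEGED:
            hits.append(
                (match.group("result"), match.group("user"), match.group("source"))
            )
    return hits


# Invented sample lines for demonstration only.
sample_log = [
    "Jan 16 09:12:01 db01 sshd[2211]: Accepted password for oracle from 10.0.0.7 port 51234 ssh2",
    "Jan 16 09:13:44 db01 sshd[2215]: Failed password for root from 192.0.2.10 port 40022 ssh2",
    "Jan 16 09:14:02 db01 sshd[2219]: Accepted publickey for alice from 10.0.0.8 port 51300 ssh2",
    "Jan 16 09:15:10 db01 sshd[2223]: Failed password for invalid user admin from 192.0.2.10 port 40030 ssh2",
]

for hit in direct_privileged_logins(sample_log):
    print(hit)
```

Both successful and failed attempts are reported, since either one shows that someone is still trying to use a privileged account directly.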

I would start with the root password on every system, changing it and giving it out only to a select few senior system administrators. We can sort out sudo access for junior administrators as we move forward. Next, I would work with the lead database administrator and have the appropriate account passwords changed.

As people start complaining they no longer have access, we will evaluate each individual's role and determine if they truly need access. Too often system developers have root access to production systems. If developers need access to production systems, then, in my opinion, the application isn't ready for production.

Of course, while reviewing these critical system and application accounts, the system administrator accounts should also be reviewed. Sometimes you will find accounts for individuals who are no longer employed; these should be removed immediately.
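A simple way to find such leftovers is to compare the accounts on a system against a current staff roster. The sketch below assumes both lists are already available as plain usernames; the names and the `orphaned_accounts` helper are hypothetical.

```python
def orphaned_accounts(system_accounts, current_staff):
    """Return accounts that exist on the system but match no current employee."""
    return sorted(set(system_accounts) - set(current_staff))


# Hypothetical usernames for demonstration only.
system_accounts = ["alice", "bob", "carol", "dave"]
current_staff = ["alice", "carol"]

print(orphaned_accounts(system_accounts, current_staff))  # ['bob', 'dave']
```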

I would also set password aging on system administrator accounts so that unused accounts are locked. This will help identify dormant accounts.
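The same idea can be checked from the outside by looking at last-login dates. The following sketch flags accounts that have not been used within a chosen window; the 90-day threshold, the usernames, and the dates are assumptions for illustration, and how last-login data is collected will depend on the platform.

```python
from datetime import date, timedelta


def dormant_accounts(last_logins, today, max_idle_days=90):
    """Return accounts never used, or not used within max_idle_days of today."""
    cutoff = today - timedelta(days=max_idle_days)
    return sorted(
        user
        for user, last in last_logins.items()
        if last is None or last < cutoff
    )


# Hypothetical last-login data; None means the account was never used.
last_logins = {
    "alice": date(2011, 1, 10),
    "bob": date(2010, 9, 1),
    "carol": None,
}

print(dormant_accounts(last_logins, today=date(2011, 1, 16)))  # ['bob', 'carol']
```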

Once I have narrowed down the systems I am responsible for and who has access to them, I will closely examine the architecture and processes to ensure business continuity. Are we doing backups? Do we have redundant or mirrored storage solutions? How often do we test failover and recovery procedures?

By this time, you should already have a good understanding of the existing change management processes. If they are insufficient, then lobby to get them fixed! Do you have a high provisioning rate? In other words, are new systems routinely being inserted into production with little-to-no testing?

Understanding the team is critical. You're always going to find an eclectic blend of personalities and talent in a group of system administrators. It's your job as a leader to determine who really “knows their stuff” and who has everyone bamboozled. Find out who the information hoarders are and those who have been stuck in a role they aren't happy in.

Lastly, you might stumble across some strange component in the architecture that is either antiquated or just completely different from everything else. For example, the architecture is 99% Red Hat Linux, but you have one HP-UX box running one small application, and none of the system administrators knows anything about it.

My first question is: how did it get into the architecture? What's the long-term plan to maintain and support it? Will the team get any training on it? Or is this one of those situations where someone outside of the operations group has been granted exclusive root access to the system? [Cringe]

In the end, it is really your experience and leadership which can help to improve a system administration team. It has been my experience that having a clear picture of the environment you are expected to build and maintain is a critical first step to ensuring the success of the team.

cyberbofh Interesting article.
Showing how to gain insight about who does what and why or.. making it sure.

But..
(there always is a but, isn't there?)

Even when I understand you want to focus first on your areas of responsibility and then on backups, I don't agree with that.
Remember the golden rule in IT? "Backup, backup, backup!"

You want to implement changes, but the only good way to do that is with working backups!
Even if you back up "too much" (is that even possible?) at first, make sure you have working backups before you change anything, because the impact of changes (root passwords, etc.) is heavy.
The views expressed in this post are the opinions of the Infosec Island member that posted this content. Infosec Island is not responsible for the content or messaging of this post.

Unauthorized reproduction of this article (in part or in whole) is prohibited without the express written permission of Infosec Island and the Infosec Island member that posted this content--this includes using our RSS feed for any purpose other than personal use.