Pursuant to HR Policy HR-025, all non-work related material has been removed from this blog.
"Wins" for the month of September:
1. First up, the University and Medical Center now share a unified view of RFC1918 addresses, the reverse space for the Medical Center networks, and the forward zone for mc.vanderbilt.edu internally to the Vanderbilt community. This resolves a multitude of issues where data between the two organizations was out of sync, causing conflicting name resolutions. Additionally, this supports the new secure email relay implementation for servers by providing proper reverse resolution for both VU and VUMC.
2. Additional departments were trained for self-serve IPAM and DNS. The Owen Graduate School of Management & the Vanderbilt University Law School now are empowered to administer their own IP and DNS space.
3. The Diamond IP environment was upgraded to version 3.0.71, resolving some serious memory leaks present in the earlier version.
4. DHCP migrations from NetID to Diamond IP continue, with the current migrations 90% complete. This puts us well on track to have the NetID environment retired in November. That covers the major events of the month.
1. BIND Views are finally working in DIP on the primary name server. There is still an issue with getting the views propagated to slave servers. Following the ISC instructions does not work and the vendor has been engaged.
2. The issue with pushing updates in the DIP environment has been resolved. Apparently ActiveMQ was refusing to play nicely. This paves the way for upgrades to the Sapphire Appliances and to the application.
3. Disk consolidation continues in the Virtualization environment. All "troubled" LUNs have been replaced. Additionally, prep work to retire the AMD ESX servers continues. When all is said and done, the ESX environment will drop from 20 hosts down to 12 hosts.
4. Work continues on resolving backup issues, with a number of hosts moving off the .1 network to the Admin Network. There are still a number of hosts that need this addressed.
1. DNS/DHCP
- The Diamond IP environment received an upgrade to 3.0.62 in the hope of solving some issues with zone publishing. While the software itself is stable, the problem was not solved. Oh well…
- The remaining 4 personnel in Application Hosting received their training on Self-Serve DNS/IPAM with the Diamond IP InControl software. I think everyone on both the AppHosting and ND&E teams can agree that this is a definite Win for both teams.
- RFC1918 subnets continue to be imported into the Diamond IP environment. Difficulties in DHCP failover w/ the supplied DHCP 3.0.6 version from BT INS keep us from really diving into migration of DHCP enabled subnets.
- I have exhausted my ideas on trying to get replicated DNS BIND Views implemented without using 72 hours of imports or significant name resolution service downtime. I escalated to BT INS but have yet to get a solid answer back from them.
- DNS survived the Great Power Outage of 08. Sure, service was degraded a bit with the Master down, but service never dropped completely off. YAY!
Next up… getting those Views implemented, NCS Self-Serve DNS Training, and Sapphire 3.0.72 upgrades.
2. The Virtual Environment
- Virtual Center upgraded to VC2.5-Update 3 – Kendra knocked it out of the park. Absolutely HAMMERED it. What an awesome job by her. I’m still wondering how some of the upgrade bugs escaped the VMware QA lab. We ran into the issue of vpxa corrupting on 3.0.2 hosts w/ VCMS 2.5u3. Took me a good portion of the night to figure out what was going on and how to fix it. While it took some time and effort, the capabilities now offered with VC2.5u3 with ESX 3.5u3 have made our life soooo much better. And speaking of ESX 3.5u3….
- ESX Upgrades from 3.0.2 to 3.5 Update 3 – While it is not 100% complete (the prod AMD clusters remain to be upgraded), I am going to call this a WIN as over 1/2 of the total environment is upgraded and working extremely well. Storage VMotion has enabled us to FINALLY perform some much needed SAN consolidation. I also happen to love the new Health Condition report within the VI Client. I would love to say this upgrade was totally without downtime, but it was not meant to be so… of course, the downtime was pretty much our own fault. Putting servers on local storage, lack of VMware Tools, etc. Great job to Kendra and Scott E. for stepping up to help do the upgrades. BIG thanks!
- The Leviathan was pushed into production in a rushed manner to make up for the broken snapshots w/ the VCMS upgrade. Thanks to Kenon for the quick weekend work to get the storage presented and help get the service up and running. On a more positive note, it did push me rather forcefully into figuring out all the tricks with ESX 3.5u3 as well as getting the plugins working for VCMS. Nothing like a little pressure to make learning so much more satisfying.
3. RHN Upgrade
- RHN 5.1.1 has been pushed out the door and into production. 97 of the former ITS clients are re-registered and I hope to get the other departments to finally buy into what RHN Satellite can offer them in terms of ease of deployment via kickstarts, activation keys for RHEL5, easier/quicker patching, and a view into the health of their RHEL environments.
- RHN 5.2 was FINALLY released as well. It came a bit too close to the 5.1.1 production date to put too much effort into it, but that is up on the slate soon. With Oracle 10g support (FINALLY), we can move this database to our existing, more robust clusters and gain some performance.
- The re-registration scripts worked for the most part and made it quite easy to register. Scripts are available for anyone in the Vanderbilt Community to take advantage of this service.
Quite simple actually…
First, you need to get the HBAs to issue a LIP and then a re-scan
- echo 1 > /sys/class/fc_host/<host #>/issue_lip
- echo "- - -" > /sys/class/scsi_host/<host #>/scan
Do this for every host path.
Now you just need to tell PowerPath to go do its normal discovery
If you do a display, you should see the new LUN.
That’s all there is to it…. go forth and fdisk!
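If you have more than a couple of host paths, the steps above can be wrapped in a loop. What follows is only a sketch under a few assumptions: `SYSFS_ROOT` and the two function names are made up for illustration, and the `powermt config` / `powermt display dev=all` calls are my assumption about the PowerPath discovery and display steps, which are not spelled out above.

```shell
#!/bin/sh
# Sketch: rescan every FC HBA for new LUNs, then let PowerPath rediscover.
# SYSFS_ROOT is a hypothetical override so the loop can be tried against a fake tree.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"

rescan_hbas() {
    # Ask each Fibre Channel host to issue a LIP ...
    for lip in "$SYSFS_ROOT"/class/fc_host/host*/issue_lip; do
        [ -e "$lip" ] || continue
        echo 1 > "$lip"
    done
    # ... then have each SCSI host rescan all channels, targets, and LUNs.
    for scan in "$SYSFS_ROOT"/class/scsi_host/host*/scan; do
        [ -e "$scan" ] || continue
        echo "- - -" > "$scan"
    done
}

powerpath_discover() {
    # Assumption: PowerPath's powermt CLI is installed; skip quietly if it is not.
    command -v powermt >/dev/null 2>&1 || return 0
    powermt config            # assumed discovery step: pick up newly visible paths
    powermt display dev=all   # assumed display step: the new LUN should be listed
}
```

Run as root, calling `rescan_hbas` and then `powerpath_discover` covers every host path in one go instead of echoing into each one by hand.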
1. DNS/DHCP
- Initial imports of NetID data into the Production Diamond IP environment were initiated. There have been some hiccups along the way with records having old hostnames with different associated DNS records. The update to the record caused some of the resource records to be changed back to the original names. This was responsible for some outages in the web environment as name resolution between the front-end and back-ends was broken.
- With the initial import otherwise successful, the first of many trainings to allow for Self-Serve IPAM/DNS was conducted with the Application Hosting group. Training was very successful and the AppHost team has taken to IPAM/DNS better than I hoped. The first "outside of ITS" training started the last week of Oct with MIS up to bat first. All signs also point to successful training and acceptance. Make-up training for remaining App Host personnel, as well as extended invitations to the rest of the DNS/DHCP team, has been scheduled for early November.
- Documentation on the Diamond IP environment was completed and disseminated. Anyone should be able to grab the documents and conduct failover or installation/configuration.
- External authorization against LDAPS was finally completed, paving the way for external Self-Serve IPAM/DNS.
- There were some outages in the DNS environment caused by some issues with the firewall. After a weekend of stress, some preventive measures were put into place to keep this from occurring again.
2. Virtual Environment
- The initial attempt to upgrade VCMS to 2.5u3 failed. The database/application was successfully rolled back to the previous version. The next attempt to upgrade will come during the first week of November. Kudos and shout-outs to Kendra Thorpe for not only taking the lead on this upgrade, but kicking butt and seeing it through.
- The Leviathan arrived and was racked up. This server should be in the cluster prior to the VCMS upgrade. This will allow us to consolidate some virtual data centers and use those assets to extend our capacity in some other virtual data centers.
- An evaluation of VKernel’s Virtual Capacity Analyzer was conducted and immediately paid some dividends in clearing up some misconceptions with the Sharepoint environment. The evaluation was entirely too short but I hope to get an extension from VKernel soon. If you have not seen this product, I would recommend taking a peek at what it can do. Check it out at VKernel – http://www.vkernel.com/products/CapacityAnalyzer/
- An open-minded evaluation was performed on Microsoft’s Hyper-V. Yes, this *NIX guy actually looked at a MS product without immediately dismissing it. I found that it is a great first attempt to seriously take on VMware but still falls short in many areas. I expect MS to push plenty of resources at making the following versions much more robust and capable. I would definitely recommend it for a mostly Windows test environment.
3. RHN
- After a vulnerability was discovered in our old RHN environment, there was added emphasis placed on getting RHN 5.1.2 up and running. Re-registration scripts are mostly generated and this environment should be active before Thanksgiving.
- We are also taking this opportunity to upgrade the Oracle back-end as well. I cannot wait until RHN 5.2 comes out so we can leverage our more "modern" database environment with RHN and get off these older Oracle versions.
4. MISC other stuff
- I had the opportunity to evaluate some Apple products this month. The iPhone and a Macbook Pro managed to sneak their way onto my desk and I decided to drink Steve Jobs’s Kool-Aid for a couple of weeks. I will have to say that getting the Macbook Pro to triple boot Mac OS X, Vista Business, and Ubuntu 8.04 took some effort but, in the end, I don’t think I have been more pleased with any laptop before than I have been with this. Combine a little VMware Fusion within Mac OS X for running Vista in Unity mode and there is really no reason NOT to own one of these. Now, if I could only convince management to let me get a 17" Pro with 4 gigs of RAM, I would be as pleased as anyone could be.
- The iPhone got a bit more of a mixed reception. I must admit that 802.11g was nice, as was having a REAL browser in a phone. What really impressed me was the Exchange integration. WOW! I don’t think I have ever been as pleased with a phone based mail client for the enterprise as I was with this product. I REALLY liked it and now look at my Treo 700wx with pure hatred. A couple of things did really annoy me though… I can’t believe Apple totally dropped the ball on voice dialing. In this age of "OMG MUST BE HANDS FREE PHONE," having to touch a screen to dial is pure… archaic. Strange for such a forward thinking device. Also, and I totally blame this on the 1st gen iPhone and AT&T’s lousy EDGE network, service coverage for data services was abysmal here in Nashville. If I had to get a first gen iPhone, I would pass just for that reason. I hear that the 3G network has greater coverage and I would like to find out if that is true. Lastly, I don’t care what people say… touch screen typing is not made for fat fingers. A slide out keyboard ala Android would make this an absolute WIN in my book. That said…. I miss that phone and wish I had it back instead of this lousy Windows Mobile 5.1 Treo 700wx. I really REALLY miss it compared to this brick.
Next month is BIND Views and ESX Upgrade Frenzy. There is definitely no let-up heading into the holiday season!
1. Patching fun – Solaris & Linux servers received their patches
2. DNS/DHCP – The production Diamond IP environment was patched to Sapphire 3.0.58 and IPControl 3.0.53.
3. ESX Stabilization – Switch diversification was completed, preventing a repeat of the Aug outages. DRS rules were added to prevent failover VMs from running on the same ESX host.
4. Oracle Bug – A bug in Oracle was patched, preventing system logging from bringing down the databases.